To SQL or to NoSQL (with a simple NoSQL Example)

In this article I will cite reasons to use SQL and reasons not to do so, followed by a simple NoSQL example where we convert data from CSV to JSON and then render it in plain javascript. Our data source is a serverless document uploaded into AWS S3.

For years I used MySQL and/or SQLite. Both required a structured environment on a server. There was nothing else back then because noSQL was not widely used so we had no choice but to use a structured environment, a relational database. More recently we had choices. While exploring our NoSQL option we also get to go serverless by eliminating the mySQL server and using only a document as the data source. Going serverless is an added benefit of using noSQL.

Many times our use case only requires a flat file format with a few metrics on a small dashboard. Our requirement is not asking for table joins, nothing fancy. So would a document format suffice? I think yes. Will plain JSON do? I also think yes mainly because while rendering html, JSON is the defacto format for handling data so why not just use a JSON formatted table as a single file data source.

While exploring JSON I discovered plain javascript array methods to replace the SQL query engine. I was off to the races with noSQL and never turned back. I now have my data in JSON and my CRUD operations in plain vanilla javascript. NoSQL has really simplified my programming activities.

1 where we use many tables in a structured environment and we establish relationships between these tables we will need to use SQL

2 to do date math we need SQL. JSON does not support dates but can store them in string format. So if ur app needs to calculate date or time math use postgreSQL instead. Also, ps, SQLite does not have a date object either.

3 because back in the day this is all there was so us old folks can’t get away from using SQL

4 Transactional data, where there are constant updates by many users and concurrent users, we need SQL for this use case. We want a reliable transaction. Document format is not good for this because it is only eventually consistent, meaning not in real time. Transactions, updated frequently and accessed by many concurrent users, need fail proof reliability. We can not risk two users seeing different data sets because our document format eventually updates vs updating immediately.

For concurrent users, updating data needs to be in real time, with noSQL there can be a delay due to file caching. SQL uses a commit approach where we commit our transaction so the next user sees it right away. The database is held open for the transaction to finish and other user cannot access data until after this commit.

6 another use case would be if we have external data in other formats that we want to migrate to postgres. For data migration we need to use SQL.

7 if we are using query tools like pgAdmin or DBeaver in a schema, then the structured data is required.

8 Another reason for using SQL is if we want columns and rows. If we are used to the spreadsheet format, then we iterate over each row column by column. With noSQL we are in a document format and columns and rows do not exist. Instead we have an array of keys and values (the key value pair)

11 A big reason to use SQL is because lots of large corporations are using Postgres where JSON lives alongside SQL. So at minimum we must have basic knowledge of both SQL and noSQL.

1 Use NoSQL when one flat file will do for our solution. Many times we find ourselves gravitating to the one table model, just for convenience. In this model we may be doing some light filtering on the data or grouping but no heavy SQL queries and relationships. So this could be the scenario to consider the NoSql model. For the simple ad hoc request we don't need the overkill of SQL for our data solution. Our needs are just not that big. And rightfully so, we should keep things simple. Remember the kiss principle?

2 so we can access and manipulate our data with js query methods and thus avoid using a SQL tool or the query engine which is high overhead. This overhead is infrastructure we just don't need for simple noSQL.

3 If u are dealing with customer master data then NoSQL is for u. Master data usually lives on one master table.

4 a frontend is always needed in an application so this is why I like to circumvent the SQL query tool and build queries in plain js. The frontend uses JSON anyway as a data format so why not just build our table in JSON.

7 Finding the value in a key-value pair is lightning fast so using noSQL as our solution.

8 Another reason to noSQL is when we have no means to access a data server or were we are not granted permission anywhere or are not part of an IT group…we can create our own document, store it, call it a db and bypass the server credentialing part. In AWS S3 we grant access to all viewers. Example: a marketing manager may want to show his group some data every month to see product line growth.

9 Another reason for noSQL is when our audience is just viewers and not contributors thus all they need is the dashboard view, they do not need access to a schema.

10 use noSQL so we can be serverless. A document database does not live on a server because it is merely a text file with a .JSON extension.

12 it is very convenient for presentations to be all on the client side so we just provide an html link to the user. We drastically reduce all the moving pieces of our application architecture when we are down to just html and plain js with a simple document db.

13 accessing data thru javascript and noSQL seems much easier than doing it in java or python which can lead to a deep dive into infrastructure.

14 In addition to use case reasons for noSQL we have a structural reason..ie to do rearchitecting. An added benefit of using NoSQl is that we get to go serverless, we get rid of all the structural overhead, the servers which can be overkill in small projects, We also become up to date with current technology.

15 The thing i like about noSQL is that u can do it without any tools, just a simple notepad will do. For SQL we need a query engine so u need to download something which could also mean u may not be serverless.

16 Another reason to be in noSQL is that one can be independent of the IT department and still maintain and manipulate the IT data.

Scenario: We are in a busy marketing department and we receive many reports but we want our own format, just want a few dashboards for monthly data analysis. We can circumvent the IT department by slapping together a frontend dashboard in javascript that accesses data from a document stored, not in a formal db, but rather in an aws s3 bucket. We only want a few metrics and where data is managed by one person, a non transactional setting. Thus as a marketing exec u can manage ur own database without needing to pay hugh amounts to an IT person or starting a big money consuming project. Also by being independent from IT we gain our own valuable knowledge about the array data structure and how to manage it. Bottom line our marketing department is saving a lot of time and money by circumventing the IT department.

In the above scenario we can still feed off a customer master by downloading a copy of the needed data to csv then converting it to JSON. So when u get ur report from IT u can then ask for it in csv format. They provide it, we get rid of unwanted columns and then transform it to JSON for use in a dashboard. Mission accomplished.

So then back to the question, is it SQL or NoSQL or perhaps a combo of both. For combo we can use Postgres as it was one of the first relational dbs to do NoSQL as an option.

Postgres is a hybrid approach using both structured SQL and unstructured noSQL. In the real world it is more likely that we will have a lot of disparate data so we should have a good understanding of both SQL and noSQL. Most data migration projects will also necessitate understanding both paradigms.

Converting CSV data to JSON format - lets migrate some data now - here we are staging our data. The following example uses noSQL without a server or query engine. It is a straight JSON file saved in an AWS s3 bucket. We manipulate the data strictly using javascript array methods (in a separate tutorial). Other noSQL databases have their own proprietary architecture with a server and query engine like Mongodb and Cassandra. I am building only the data portion of the noSQL in this example.

So now that we have asked out IT department for some csv data we need to convert it to noSQL with a JSON format. Json is a simple text file stored as a document with a dot JSON extension as in myfile.json. JSON is a comma separated array of keys and values, the key-value pair. It is readable from a web client by looping through the array, an iterable object. For both SQL and noSQL array is our most important data structure.

We are creating a row object (aka a record) for every line in noSQL, and then we comma delimit our rows. Each pair is a db column with its value (key value pair) and several columns make up one row. The collection of rows is bracketed.

The following is some code in javascript to convert data from csv to JSON. I created a small spreadsheet in google sheets and exported it to csv for this conversion.

<!DOCTYPE html>

<html>

<head>

<title>CSV to JSON Converter in HTML5</title>

</head>

<body>

<h1>CSV to JSON Conversion</h1>

<p>instructions: after selecting csv file, a link to json will appear below. Click on this link to see the converted json code, then copy and paste it to notepad, save as your_file_name.json, then go to AWS s3 bucket and upload this new file into s3</p>

<button onclick="document.querySelector('input').click()">Select CSV File</button>

var output = document.getElementById("output");

//var csv is the CSV file with headers, on click of button, filereader below reads this csv file

function csvJSON(csv){

var lines=csv.split("\n"); //use split method on each row in spreadsheet where \n is the line terminator

var result = []; //returns this csv line by line as an array

var headers=lines[0].split(","); //headers are at index position 0...getting headers first, get only line at position 0

//read each line from csv file and write to result array

for(var i=1;i<lines.length;i++){

var obj = {};

var currentline=lines[i].split(",");

for(var j=0;j<headers.length;j++){

obj[headers[j]] = currentline[j];

}

result.push(obj); //push each obj with brackets separated by comma into result object which becomes the json formatted array

//then return the stringified result below

}

//return result; //as a JavaScript object

//while stringifying need to find and replace \r which shows up in my result as unwanted character, added 8-23-22

return JSON.stringify(result).replaceAll("\\r", ''); //JSON..note need to escape \r with another \ in order to match it, then use replaceAll because replace method only does first occurrance

//also can use stringify without the replace method if source data is clean, return JSON.stringify(result);

}

/* main upload function, this returns a link called json file, press this to get json code

we are creating a filereader to read our result from above

function upload(file) {

if( file.type.match(/text\/csv/) || file.type.match(/vnd\.ms-excel/) ){

FileReader = new FileReader();

FileReader.onloadend = function() {

var json = csvJSON(this.result);

var blob = new Blob([json], {type: 'application/json'});

var url = URL.createObjectURL(blob);

//passes final json array code to output div

output.innerHTML = '<br><a href="'+url+'">JSON file</a>'; //return url object as a link which u need to click

};

FileReader.readAsText(file);

} else {

console.log("This file does not seem to be a CSV.");

}

</script>

</body>

</html>

The rendering solution - After creating our JSON file we upload it to an AWS S3 bucket and then render it in html with a simple javascript fetch. Below is some code to do it. This example pulls some actual data from my s3 bucket. This data can also be viewed in our Stacked Bar Chart Sales Orders Dashboard.

We iterate over the JSON array pulling its values into our javascript table object, then we get its innerhtml which contains the data. This is how data is transmitted from the source to the javascript frontend.

<!DOCTYPE html>

<html>

<head>

<title>view orders.json in html</title>

</head>

<body>

<h1>View JSON Data in Javascript table</h1>

</body>

function CreateTableFromJSON() {

var myOrders;

//here we replace axios with fetch, since this fetch is a GET request it is different from our post request used in my_plain_js

fetch('https://rickd.s3.us-east-2.amazonaws.com/orders.json') //getting table stored in AWS s3 bucket

.then(response => {

return response.json();

}).then(data => {

//next in promise chain, this must use a different name for the function, named as data, orig in chain name is response

// Work with JSON data below

myOrders = data; //in the data section of promise chain we replace data with myOrders var so we can work with it

const table = document.createElement("table");

const tbody = document.createElement("tbody");

table.appendChild(tbody);

let data2 = Object.keys(myOrders[0]); //get header vals from myOrders row 0 (get vals of csv header row0)

//row zero is not a key pair like body of array starting at row1

//Object.keys() returns an array whose elements are strings corresponding to the enumerable string-keyed property names found directly upon object....we are returning only the keys used in the array (the array has key pairs key:val)

console.log("object-keys = "+data2);

generateTableHead(table, data2); //then run the gen head function before the body function

//gets table header

function generateTableHead(table, data2) {

let thead = table.createTHead();

let row2 = thead.insertRow();

for (let key of data2) {

//a for-of loop where data2 is an iterable, let key of the iterable where key = name assigned

let th = document.createElement("th");

let text = document.createTextNode(key);

th.appendChild(text); //pass iterable named key into header with textnode format (grab only the text from json, escaping html chars of the string)

row2.appendChild(th);

}

//next gets table body

//myOrders.forEach((obj, idx) => {

myOrders.forEach((obj) => {

let row = tbody.insertRow();

Object.values(obj).forEach(val => {

//object.vals vs object.keys - vals returns array values excluding row0, returning only the vals of each key in the body

let cell = row.insertCell();

cell.textContent = val;

});

document.body.appendChild(table); //display the whole fragment

//end of .then(data part of the promise chainchain

}).catch(err => {

// Do something for an error here

});

} //end of CreateTable method

</script>

</html>

2. If u want to insert or update to ur s3 table then u need an AWS Lambda function

To SQL or to NoSQL - including a simple NoSQL Example

scroll down for a javascript code sample - the 2nd snippet is a live demo