Software Architecture
Processing JSONL
Thursday, January 13, 2022
Layer One - Team Lead - Andrew Schwabe
In today's world, data is represented in many formats. Some have well-defined schemas and tight constraints. Others are wide open with no constraints. The latter is especially common when developing an integration channel between platforms. One data export format gaining popularity is JSONL (also known as JSON Lines).
What is JSONL?
The JSONL format is similar to a JSON array in that it holds a collection of objects. What differentiates it from a JSON array is the absence of the array syntax around those objects: there is no opening square bracket to start the collection, no closing square bracket to end it, and no commas between the objects. Instead, each object sits on its own line, and the delimiter between objects is a newline character, much like the rows of a CSV file. For example:
{ "firstName": "Amanda", "lastName": "Smith" }
{ "firstName": "Josh", "lastName": "Johnson" }
{ "firstName": "Daniel", "lastName": "Garcia" }
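To make the format concrete, here is a minimal sketch of how such a file might be produced with System.Text.Json. The Person record and the JsonlWriter class are hypothetical names introduced for illustration, and the camel-case naming policy is an assumption chosen so the output matches the property names in the example above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical type mirroring the example records above.
public record Person(string FirstName, string LastName);

public static class JsonlWriter
{
    // Camel-case policy so FirstName is written as "firstName".
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    // Writes one compact JSON object per line: no brackets, no commas.
    public static void Write(TextWriter output, IEnumerable<Person> people)
    {
        foreach (var person in people)
        {
            // The serializer does not indent by default, so each object
            // stays on a single line.
            output.Write(JsonSerializer.Serialize(person, Options));
            output.Write('\n');
        }
    }
}
```

Writing an explicit '\n' rather than calling WriteLine keeps the line endings the same on every platform.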
How do I parse this data?
Passing the whole file to a standard JSON parser will throw an exception, because the file as a whole is not a valid JSON document. So how can this data be parsed and consumed? In .NET there are two main options. The first is to iterate through the file line by line, parsing each line with the System.Text.Json tools. A more elegant solution is to use Newtonsoft.Json with some configuration:
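using System.IO;
using System.Text;
using Newtonsoft.Json;

// `file` (a path or Stream) and `people` (a List<Person>) are assumed to be defined by the caller.
using (var stream = new StreamReader(file, Encoding.UTF8))
using (var reader = new JsonTextReader(stream))
{
    // Allow more than one top-level JSON value in the same stream.
    reader.SupportMultipleContent = true;
    var serializer = JsonSerializer.Create();
    // Each Read() advances to the start of the next object.
    while (reader.Read())
    {
        var person = serializer.Deserialize<Person>(reader);
        if (person != null) people.Add(person);
    }
}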
Wrapping the stream in a JsonTextReader lets the consumer set SupportMultipleContent, which tells the reader to expect multiple top-level JSON objects in the same stream rather than a single document.
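For comparison, the first option, reading the file line by line with System.Text.Json, might look like the sketch below. The Person record and the JsonlReader class are hypothetical names introduced for illustration. The camel-case naming policy is an assumption made so that "firstName" maps to FirstName, and skipping blank lines is a design choice of this sketch.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical type matching the example records.
public record Person(string FirstName, string LastName);

public static class JsonlReader
{
    // Camel-case policy so "firstName" in the data binds to FirstName.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    // Parses each non-blank line as its own JSON document.
    public static List<Person> Read(TextReader input)
    {
        var people = new List<Person>();
        for (var line = input.ReadLine(); line != null; line = input.ReadLine())
        {
            // Tolerate blank lines, such as a trailing newline at end of file.
            if (string.IsNullOrWhiteSpace(line)) continue;

            var person = JsonSerializer.Deserialize<Person>(line, Options);
            if (person != null) people.Add(person);
        }
        return people;
    }
}
```

A caller could pass in a StreamReader opened on the file, for example JsonlReader.Read(new StreamReader(path)), where path is whatever file location applies. Because only one line is held in memory at a time, this approach also suits large exports.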
The team here has put together three .NET 6 command-line tools to get you started writing and reading JSONL formatted data.