How to use JSON, SODA, Python and Oracle Autonomous JSON Database – Part 2

This is the second part of a 3 part series.

In the first post of this series we saw how we can connect a Python program to an Oracle Database. We also explored how we could create a table with a JSON document using SQL calls in Python. We are now ready to move onto SODA… this is where the real power of using Oracle as a JSON store comes into play.

NOTE: all code samples in this Blog entry is available github-

The sample code used in this blog entry can be downloaded from Github: https://github.com/seantstacey/SODA

1. Introducing SODA

In this post we are exploring the SODA APIs for Python. But you should be aware that SODA is also available for Java, REST, C and Node.js

So what is exactly is SODA? it’s best described as follows (taken from the Oracle documentation):

“Simple Oracle Document Access (SODA) is a set of NoSQL-style APIs that let you create and store collections of documents (in particular JSON) in Oracle Database, retrieve them, and query them, without needing to know Structured Query Language (SQL) or how the documents are stored in the database.”

SODA documentation link: https://docs.oracle.com/en/database/oracle/simple-oracle-document-access/python/index.html

Before we get started, for those of you with a relational background who are not familiar with collections and document stores here’s a cross-reference to get you started:

Document: A Document is the equivalent to a relational row.

Collection: A Collection is the equivalent to a relational table. A Collection contains multiple Documents.

Database: A Database contains multiple collections and is the equivalent of a Schema in Oracle.

If you’re already familiar with the schema-less model then this interface should appeal to you. On the other hand if you’re more familiar with the relational style of data modeling, then you’re in for an interesting ride…

In the Schema-less approach to creating a collection, your code can create a collection without requiring any metadata definition or DDL (Data Definition Language). Furthermore, it is possible to insert a document in a pre-created collection that does not match any previously inserted documents (or Schema Evolution).

One last thing about SODA. It is important to understand that SODA is NOT mandatory when working with JSON and the Oracle Database, as we saw in part 1 of the series. However, as you will soon see, SODA provides a very nice solution to bridging the programing gap between the NOSQL model and the relational model.

2. SODA in Action

To understand this properly let’s see a simple example. The document in this series contains some animal classification data…

source code: https://github.com/seantstacey/SODA/blob/main/soda-one.py

The first thing you should notice is that we do not need to import any library or additional modules to use SODA. It’s functionality is baked into cx_Oracle and the Oracle database engine, meaning that it’s quick and easy to use straight out of the box.

NOTE: With that being said, you will need to grant your database user the SODA_APP role from inside the Oracle database.

Before running the script we can verify that there are no existing objects in the database called “sodacollection”. This is important because the python script simply calls: soda.createCollection(“sodacollection”). We have not specified any data definition to define the collection or database table or columns for the mapping.

So here’s the SQL script we will use to view the contents of the collection called “sodacollection”:

source code: https://github.com/seantstacey/SODA/blob/main/soda_query.sql

And here’s what it returns when we run the SQL statement before running the python program.

Yes those error messages are expected, at this stage we should not see anything related to our collection (table) called “sodacollection”. We are now ready to run the first python-SODA script: soda-one.py

The script creates a simple collection containing a single document using SODA. As already mentioned, this is intentionally a basic code sample as the main goal here is to demonstrate how easy it is to get started with SODA.

You can see when we ran the python script it created a collection with a single JSON document containing information about the marsupial “Koala” along with an ID for the document.

Now let’s see what the collection object looks like from inside the database as well as the document itself. The SQL below shows the table created for the collection, along with a simple SQL query of the table to return its contents. The third query uses the JSON_SERIALIZE function to display the contents of the document. We are returning this using “pretty format” as opposed to a single row.

As neat as this is, a more realistic approach is to perform a bulk insert of multiple documents. So let’s take a look at how this is done.

This time we are appending 5 more documents to the existing collection. But this is where things get interesting. In the schema-less world, it’s possible to insert a document that does not map or conform to a previously defined format (schema evolution). So in our insert we are adding an additional element in our document. For this example we will focus on the “Inland Taipan” it is an extremely venomous snake so we are going to add an element to capture that classification information. Working with this late modification requires no changes to our underlying data model and the Oracle database can handle this last minute change without any issues what-so-ever. Finally in the last part of the sample we are running a Query-By-Example, we’ll discuss more about that in the next section.

source code: https://github.com/seantstacey/SODA/blob/main/soda-two.py

3. Query by Example aka QBE

So, what is QBE? You may notice that the last section of our script introduces Query-By-Example, let’s take a look at what we’re doing here. As you would expect, it is possible to search (or “filter”) the collection for a specific value in this case: “marsupials”. You may recall in soda-one.py, “Koala” was the first animal to be cataloged and it is a “marsupial”, so it should also be included in our search results. To perform a search in our collection we are filtering for all “infraclass” that begin with “marsupial%”.

Here’s what it looks like when we run the script: soda-two.py

We see three marsupials are returned: Platypus, Red Kangaroo and of course Koala. You will also notice the order or sequence does not match how they were inserted into the collection.

Now is a good time to take a look at what the collection looks like when we query the collection from inside the Oracle database. To make it easier to follow this sample, we are just returning a single row from the table called: sodacollection. You should be able to see the correlation between this document, id, name, etc and the output from the Python script.

4. Dropping a Collection:

The last thing you need to be aware of when working with SODA collections in Oracle is that there is hidden metadata associated with the collections you create in the database. So it is important to note that you should not delete an underlying table for a collection from the database using a SQL drop operation, as you will not cleanly delete all the metadata associated with that table (or collection). To be clear, there is nothing preventing you from deleting a collection in the database, however you will not be able to re-use the collection name again as there will still be metadata associated with that name and your python program will throw an error when it attempts to create the collection with the same previously used name.

Hence, the safest way to delete a collection using SQL from inside the database is using the SQL function: dbms_soda.drop_Collection

source code: https://github.com/seantstacey/SODA/blob/main/dropSodaCollection.sql

In this example we can see the function returns a value of “1” when it succeeds and “0” if it fails (as was the case when we ran it for a second time). In other words it is an idempotent operation.

NOTE: In part 3 of the series we will see how we can drop our collection using python.

That ends our second part in this series.

In the next part of this series we will be exploring the use of what we have learned so far, but applying it to an Autonomous JSON Database instead of a local RDBMS.

The sample code used in this blog entry is available for download from Github: https://github.com/seantstacey/SODA