Setup¶
Install the Signal Ocean SDK:
!pip install signal-ocean
Set your subscription key, acquired here: https://apis.signalocean.com/profile
signal_ocean_api_key = "" # replace with your subscription key
Description¶
This notebook demonstrates a way of persisting historical tonnage list data into a local SQLite database.
SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.
Source: https://www.sqlite.org/index.html
Python has a built-in sqlite3
module for working with SQLite. There is no installation or configuration step; to use SQLite, we only need to import the module:
import sqlite3
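As a quick sanity check (a minimal sketch, separate from the notebook's workflow), the module can run SQL against an in-memory database with no setup at all:

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, name TEXT)")
conn.execute("INSERT INTO demo VALUES (1, 'Aframax')")
row = conn.execute("SELECT name FROM demo WHERE id = 1").fetchone()
print(row[0])  # Aframax
conn.close()
```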
Step 1 - Create the database¶
Create a database named HistoricalTonnageList.db
and connect to it:
db_connection = sqlite3.connect("HistoricalTonnageList.db")
Note that a HistoricalTonnageList.db
file has been created in the current working directory. This is the database that we have just created and connected to. Running this code a second time will not result in a new database file being created; it will simply connect to the existing database with the specified name.
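We can confirm both points directly: the file appears on disk as soon as we connect, and the `sqlite_master` catalog lists the database's tables (none yet, for a freshly created database). A self-contained sketch:

```python
import os
import sqlite3

# Connecting creates the file if it does not exist, otherwise reuses it.
db_connection = sqlite3.connect("HistoricalTonnageList.db")

print(os.path.exists("HistoricalTonnageList.db"))  # True

# sqlite_master lists every table in the database.
tables = db_connection.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(tables)
```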
Step 2 - Query the Tonnage List API¶
For this example, we are going to retrieve the historical tonnage lists of Aframax vessels in the port of Ceyhan for the last seven days.
from datetime import date, timedelta
from signal_ocean import Connection
from signal_ocean.tonnage_list import (
    TonnageListAPI,
    VesselClassFilter,
    PortFilter,
    DateRange,
)
api_connection = Connection(signal_ocean_api_key)
api = TonnageListAPI(api_connection)
vessel_class_filter = VesselClassFilter(name_like="aframax")
vessel_class = api.get_vessel_classes(vessel_class_filter)[0]
port_filter = PortFilter(name_like="ceyhan")
port = api.get_ports(port_filter)[0]
laycan_end_in_days = 512
start_date = date.today() - timedelta(days=7)
end_date = date.today()
htl = api.get_historical_tonnage_list(
    port, vessel_class, laycan_end_in_days, DateRange(start_date, end_date)
)
data_frame = htl.to_data_frame()
print(data_frame.shape)
data_frame.head()
(7613, 23)
date | imo | name | vessel_class | ice_class | year_built | deadweight | length_overall | breadth_extreme | subclass | market_deployment_point_in_time | push_type_point_in_time | ... | commercial_operator_point_in_time | commercial_status_point_in_time | eta_point_in_time | latest_ais_point_in_time | open_prediction_accuracy_point_in_time | open_country_point_in_time | open_narrow_area_point_in_time | open_wide_area_point_in_time | availability_port_type_point_in_time | availability_date_type_point_in_time
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2021-10-12 12:00:00+00:00 | 9486910 | Beta | Aframax | NaN | 2010 | 105319 | 228.60 | 42 | Dirty | Contract | Not Pushed | ... | Held Maritime Shipping | Available | 2019-10-27 09:00:00+00:00 | 2019-10-13 04:47:29+00:00 | Narrow Area | Netherlands | Continent | UK Continent | Prediction | Prediction
 | 9138604 | Barakuda Natuna | Aframax | NaN | 1998 | 109277 | 244.60 | 42 | Clean | Spot | Not Pushed | ... | Soechi Lines | Available | 2020-09-12 17:00:00+00:00 | 2020-08-23 01:45:14+00:00 | Narrow Area | Malaysia | Singapore / Malaysia | South East Asia | Source | Source
 | 9038830 | Proteo | Aframax | NaN | 1993 | 99392 | 243.97 | 46 | Dirty | Program | Not Pushed | ... | PDVSA | Available | 2021-01-12 04:00:00+00:00 | 2020-12-22 22:32:34+00:00 | Narrow Area | Venezuela, Bolivarian Republic of | Caribs | Caribs | Source | Source
 | 8508292 | Gunung Kemala | Aframax | NaN | 1986 | 86962 | 242.00 | 42 | Dirty | Program | Not Pushed | ... | Pertamina | Available | 2021-03-09 07:00:00+00:00 | 2021-02-09 16:04:53+00:00 | Narrow Area | Indonesia | Indonesia | South East Asia | Prediction | Prediction
 | 9038854 | Warrior King | Aframax | NaN | 1993 | 99371 | 243.97 | 46 | Dirty | Program | Not Pushed | ... | PDVSA | Available | 2021-07-02 06:00:00+00:00 | 2021-06-06 21:34:32+00:00 | Narrow Area | Curacao | Caribs | Caribs | Prediction | Prediction
5 rows × 23 columns
Step 3 - Import the data into the database¶
In order to import the data, we first have to install sqlalchemy:
!pip install sqlalchemy
We'll use SQLAlchemy to create a table for our historical tonnage list:
import os
from sqlalchemy import create_engine
engine = create_engine(
    f'sqlite:///{os.path.join(os.path.abspath(os.getcwd()), "HistoricalTonnageList.db")}'
)
data_frame.to_sql("htl_ceyhan_aframax", engine, index=True, if_exists="append")
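The if_exists="append" argument matters here: "fail" (the default) raises an error if the table already exists, "replace" drops and recreates it, and "append" inserts new rows into the existing table. A minimal, self-contained sketch of the difference, using an in-memory database and a toy DataFrame rather than the tonnage list data:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite://")  # private in-memory SQLite database

df = pd.DataFrame({"imo": [9486910], "name": ["Beta"]})

df.to_sql("demo", engine, index=False, if_exists="append")  # creates the table
df.to_sql("demo", engine, index=False, if_exists="append")  # appends a second copy
after_append = pd.read_sql_query("select count(*) as n from demo", engine)["n"][0]

df.to_sql("demo", engine, index=False, if_exists="replace")  # drops and recreates
after_replace = pd.read_sql_query("select count(*) as n from demo", engine)["n"][0]

print(after_append, after_replace)  # 2 1
```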
Step 4 - Verify that the data has been correctly imported¶
import pandas as pd
data_frame_from_db = pd.read_sql_query(
    "select * from htl_ceyhan_aframax", db_connection
)
data_frame_from_db.head()
 | date | imo | name | vessel_class | ice_class | year_built | deadweight | length_overall | breadth_extreme | subclass | ... | commercial_operator_point_in_time | commercial_status_point_in_time | eta_point_in_time | latest_ais_point_in_time | open_prediction_accuracy_point_in_time | open_country_point_in_time | open_narrow_area_point_in_time | open_wide_area_point_in_time | availability_port_type_point_in_time | availability_date_type_point_in_time
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
0 | 2021-10-12 12:00:00.000000 | 9486910 | Beta | Aframax | None | 2010 | 105319 | 228.60 | 42 | Dirty | ... | Held Maritime Shipping | Available | 2019-10-27 09:00:00.000000 | 2019-10-13 04:47:29.000000 | Narrow Area | Netherlands | Continent | UK Continent | Prediction | Prediction |
1 | 2021-10-12 12:00:00.000000 | 9138604 | Barakuda Natuna | Aframax | None | 1998 | 109277 | 244.60 | 42 | Clean | ... | Soechi Lines | Available | 2020-09-12 17:00:00.000000 | 2020-08-23 01:45:14.000000 | Narrow Area | Malaysia | Singapore / Malaysia | South East Asia | Source | Source |
2 | 2021-10-12 12:00:00.000000 | 9038830 | Proteo | Aframax | None | 1993 | 99392 | 243.97 | 46 | Dirty | ... | PDVSA | Available | 2021-01-12 04:00:00.000000 | 2020-12-22 22:32:34.000000 | Narrow Area | Venezuela, Bolivarian Republic of | Caribs | Caribs | Source | Source |
3 | 2021-10-12 12:00:00.000000 | 8508292 | Gunung Kemala | Aframax | None | 1986 | 86962 | 242.00 | 42 | Dirty | ... | Pertamina | Available | 2021-03-09 07:00:00.000000 | 2021-02-09 16:04:53.000000 | Narrow Area | Indonesia | Indonesia | South East Asia | Prediction | Prediction |
4 | 2021-10-12 12:00:00.000000 | 9038854 | Warrior King | Aframax | None | 1993 | 99371 | 243.97 | 46 | Dirty | ... | PDVSA | Available | 2021-07-02 06:00:00.000000 | 2021-06-06 21:34:32.000000 | Narrow Area | Curacao | Caribs | Caribs | Prediction | Prediction |
5 rows × 25 columns
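Note the column count: the original frame had 23 columns plus a two-level (date, imo) index, and because we passed index=True, to_sql stored the index levels as ordinary columns, so the round-tripped frame has 25. A small sketch of this behaviour with a toy MultiIndex frame:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")

# A two-level index, like the (date, imo) index of the tonnage list frame.
df = pd.DataFrame(
    {"name": ["Beta", "Proteo"]},
    index=pd.MultiIndex.from_tuples(
        [("2021-10-12", 9486910), ("2021-10-12", 9038830)],
        names=["date", "imo"],
    ),
)

df.to_sql("demo", conn, index=True)
round_tripped = pd.read_sql_query("select * from demo", conn)

print(df.shape[1], round_tripped.shape[1])  # 1 3: the index levels became columns
print(list(round_tripped.columns))          # ['date', 'imo', 'name']
```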
Step 5 - Update the created table¶
As time passes and new historical tonnage lists become available, we want to append their contents to our table.
To do this, we first query the table for the date of the most recent historical tonnage list we have already stored:
cursor = db_connection.cursor()
cursor.execute("select max(date(date)) from htl_ceyhan_aframax")
max_date_in_table = cursor.fetchone()[0]
max_date_in_table
'2021-10-12'
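Here date(...) is SQLite's built-in function for truncating a timestamp string to its date part, so max(date(date)) yields the most recent snapshot day. A self-contained sketch of the same query on a toy table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (date TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?)",
    [("2021-10-11 12:00:00",), ("2021-10-12 12:00:00",)],
)

# date() strips the time component; max() then picks the latest day.
latest = conn.execute("select max(date(date)) from demo").fetchone()[0]
print(latest)  # 2021-10-12
```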
Then, we can use the day after that date as the start_date
for further historical tonnage list requests; using the stored date itself would fetch that day's rows a second time, since they are already in the table:
start_date = date.fromisoformat(max_date_in_table) + timedelta(days=1)
end_date = date.today()
htl = api.get_historical_tonnage_list(
port, vessel_class, laycan_end_in_days, DateRange(start_date, end_date),
)
data_frame = htl.to_data_frame()
Finally, we append the new rows to our table:
data_frame.to_sql("htl_ceyhan_aframax", engine, index=True, if_exists="append")
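Because if_exists="append" performs blind inserts, re-running an import for an overlapping date range would duplicate rows. A defensive pattern (a minimal sketch on a toy table, not the notebook's actual data) deletes any stored rows for the incoming dates before appending:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")

# Toy stand-in for the tonnage list table.
pd.DataFrame({"date": ["2021-10-11", "2021-10-12"], "imo": [1, 2]}).to_sql(
    "htl_demo", conn, index=False
)

# The new batch overlaps the last stored day.
new_batch = pd.DataFrame({"date": ["2021-10-12", "2021-10-13"], "imo": [2, 3]})

# Delete the overlapping days first, then append: no duplicates survive.
dates = sorted(new_batch["date"].unique())
conn.execute(
    f"delete from htl_demo where date(date) in ({','.join('?' * len(dates))})",
    dates,
)
new_batch.to_sql("htl_demo", conn, index=False, if_exists="append")

result = pd.read_sql_query("select date, imo from htl_demo order by date", conn)
print(result.values.tolist())  # [['2021-10-11', 1], ['2021-10-12', 2], ['2021-10-13', 3]]
```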