Setup¶
Install the Signal Ocean SDK:
pip install signal-ocean
Set your subscription key acquired here: https://apis.signalocean.com/profile
!pip install signal-ocean
signal_ocean_api_key = '' #replace with your subscription key
Call the Voyages API¶
The Voyages API retrieves information about vessel voyages.
from signal_ocean import Connection
from signal_ocean.voyages import VoyagesAPI
import pandas as pd
from datetime import date, datetime
import plotly.express as px
connection = Connection(signal_ocean_api_key)
api = VoyagesAPI(connection)
Get vessel class id for VLCCsTankers
vessel_classes = api.get_vessel_classes()
vessel_classes_df = pd.DataFrame(v.__dict__ for v in vessel_classes)
vessel_classes_df[vessel_classes_df['vessel_type']=='Tanker'].head(7)
vessel_class_id | vessel_class_name | vessel_type_id | vessel_type | |
---|---|---|---|---|
11 | 84 | VLCC | 1 | Tanker |
12 | 85 | Suezmax | 1 | Tanker |
13 | 86 | Aframax | 1 | Tanker |
14 | 87 | Panamax | 1 | Tanker |
15 | 88 | MR2 | 1 | Tanker |
16 | 89 | MR1 | 1 | Tanker |
17 | 90 | Small | 1 | Tanker |
Get voyages for VLCCs¶
For VLCCs we will use (vessel_class_id = 84) when calling the API
voyages = api.get_voyages_condensed(vessel_class_id=84, date_from=datetime.strptime("2021-09-01", "%Y-%m-%d"))
voyages = pd.DataFrame([v.__dict__ for v in voyages])
voyages.tail()
imo | voyage_number | vessel_type_id | vessel_class_id | vessel_status_id | commercial_operator_id | deleted | events | id | horizon_id | ... | last_discharge_sailing_date | last_discharge_country_id | last_discharge_country_name | last_discharge_area_id_level0 | last_discharge_area_name_level0 | repairs_ind | storage_ind | sts_load_ind | sts_discharge_ind | local_trade_ind | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
10341 | 9943748 | 1 | 1 | 84 | 1 | 1645.0 | False | None | I97BAC454VEDBD3C900 | 1 | ... | 2023-08-19 19:57:33+00:00 | 173.0 | Netherlands | None | Continent | True | False | True | False | False |
10342 | 9943748 | 2 | 1 | 84 | 1 | 1645.0 | False | None | I97BAC454VEDC71FD00 | 1 | ... | 2023-11-12 03:58:16+00:00 | 112.0 | India | None | Pakistan / West Coast India | False | False | True | False | False |
10343 | 9943748 | 3 | 1 | 84 | 1 | 1645.0 | False | None | I97BAC454VEDCC11700 | 1 | ... | 2023-12-20 23:54:35+00:00 | 55.0 | China | None | Central China | False | False | False | False | False |
10344 | 9943748 | 4 | 1 | 84 | 1 | 558.0 | False | None | I97BAC454VEDD103100 | 2 | ... | 2024-02-17 15:42:08.398000+00:00 | 55.0 | China | None | North China | False | False | False | False | False |
10345 | 9946673 | 1 | 1 | 84 | 1 | 1441.0 | False | None | I97C63154VEDB84AF00 | 2 | ... | NaT | NaN | None | None | None | False | False | False | False | True |
5 rows × 97 columns
Convert Timestamp to datetime, drop voyages with first_load_sailing_date in the future (ongoing or future)
voyages['first_load_sailing_date'] = voyages['first_load_sailing_date'].apply(lambda d : pd.to_datetime(d).tz_localize(None))
voyages=voyages[voyages['first_load_sailing_date']<datetime.now()]
Examine US VLCC Dirty Exports¶
Group by origin and destination (first load, last discharge), and use first_load_sailing_date as date of export
dirty_exports_country_level = voyages[(voyages['first_load_sailing_date']>"2022-01-01")&(voyages['cargo_group']=='Dirty')] \
.set_index('first_load_sailing_date') \
.groupby(['first_load_country_name', 'last_discharge_country_name'])['quantity'] \
.resample('MS') \
.sum() \
.reset_index() \
.rename(columns={'first_load_sailing_date': 'Date'})
dirty_exports_country_level.head()
first_load_country_name | last_discharge_country_name | Date | quantity | |
---|---|---|---|---|
0 | Angola | Benin | 2023-04-01 | 192000.0 |
1 | Angola | Brazil | 2022-11-01 | 260000.0 |
2 | Angola | Caribs | 2023-12-01 | 246000.0 |
3 | Angola | China | 2022-01-01 | 2601000.0 |
4 | Angola | China | 2022-02-01 | 3171000.0 |
Drop intra country flows
dirty_exports_country_level = dirty_exports_country_level[dirty_exports_country_level['first_load_country_name']!=dirty_exports_country_level['last_discharge_country_name']]
- Keep only US exports
- Aggregate all exports monthly
- Drop unnecessary columns
us_dirty_exports = dirty_exports_country_level[(dirty_exports_country_level['first_load_country_name']=='United States')].copy()
us_dirty_exports['TotalExports'] = us_dirty_exports['quantity'].groupby(us_dirty_exports['Date']).transform('sum')
us_dirty_exports.drop_duplicates(subset=['Date'], keep='last', inplace = True)
us_dirty_exports.drop(['last_discharge_country_name', 'quantity'], axis=1, inplace = True)
us_dirty_exports.head()
first_load_country_name | Date | TotalExports | |
---|---|---|---|
4276 | United States | 2022-01-01 | 6661000.0 |
4277 | United States | 2022-02-01 | 4330000.0 |
4301 | United States | 2024-02-01 | 1956000.0 |
4302 | United States | 2022-03-01 | 5702000.0 |
4303 | United States | 2022-04-01 | 5561000.0 |
Plot US VLCC Dirty exports¶
us_dirty_exports['Year'] = us_dirty_exports['Date'].apply(lambda date : date.year)
us_dirty_exports['Month'] = us_dirty_exports['Date'].apply(lambda date : date.month)
us_dirty_exports.sort_values(['Month', 'Year'], inplace = True)
us_dirty_exports['Month'] = us_dirty_exports['Date'].apply(lambda date : date.strftime("%b"))
fig = px.histogram(us_dirty_exports, x="Month", y="TotalExports", color='Year',
barmode='group', height=500, width=1000,
title='US VLCC Dirty Exports (MT)',
color_discrete_sequence=["lightgray", "gray", "lightblue"])
fig.show()
Examine China VLCC Dirty Imports
Convert Timestamp to datetime, drop uncompleted voyages (with last_discharge_arrival_date in the future)voyages['last_discharge_arrival_date'] = voyages['last_discharge_arrival_date'].apply(lambda d : pd.to_datetime(d).tz_localize(None))
voyages=voyages[voyages['last_discharge_arrival_date']<datetime.now()]
Group by origin and destination (first load, last discharge), and use last_discharge_arrival_date as date of import
dirty_imports_country_level = voyages[(voyages['last_discharge_arrival_date']>"2022-01-01")&(voyages['cargo_group']=='Dirty')] \
.set_index('last_discharge_arrival_date') \
.groupby(['first_load_country_name', 'last_discharge_country_name'])['quantity'] \
.resample('MS') \
.sum() \
.reset_index() \
.rename(columns={'last_discharge_arrival_date': 'Date'})
Drop intra country flows
dirty_imports_country_level = dirty_imports_country_level[dirty_imports_country_level['first_load_country_name']!=dirty_imports_country_level['last_discharge_country_name']]
dirty_imports_country_level.head()
first_load_country_name | last_discharge_country_name | Date | quantity | |
---|---|---|---|---|
0 | Angola | Benin | 2023-05-01 | 192000.0 |
1 | Angola | Brazil | 2022-12-01 | 260000.0 |
2 | Angola | Caribs | 2024-01-01 | 246000.0 |
3 | Angola | China | 2022-01-01 | 2346000.0 |
4 | Angola | China | 2022-02-01 | 2333000.0 |
- Keep only China imports
- Aggregate all exports monthly
- Drop unnecessary columns
china_dirty_imports = dirty_imports_country_level[(dirty_imports_country_level['last_discharge_country_name']=='China')].copy()
china_dirty_imports['TotalImports'] = china_dirty_imports['quantity'].groupby(china_dirty_imports['Date']).transform('sum')
china_dirty_imports.drop_duplicates(subset=['Date'], keep='last', inplace = True)
china_dirty_imports.drop(['first_load_country_name', 'quantity'], axis=1, inplace = True)
china_dirty_imports.head()
last_discharge_country_name | Date | TotalImports | |
---|---|---|---|
3825 | China | 2022-01-01 | 29399000.0 |
3826 | China | 2022-02-01 | 26518000.0 |
3850 | China | 2024-02-01 | 9693000.0 |
4249 | China | 2022-12-01 | 33871000.0 |
4250 | China | 2023-01-01 | 29848000.0 |
Plot China VLCC Dirty Imports¶
china_dirty_imports['Year'] = china_dirty_imports['Date'].apply(lambda date : date.year)
china_dirty_imports['Month'] = china_dirty_imports['Date'].apply(lambda date : date.month)
china_dirty_imports.sort_values(['Month', 'Year'], inplace = True)
china_dirty_imports['Month'] = china_dirty_imports['Date'].apply(lambda date : date.strftime("%b"))
fig = px.histogram(china_dirty_imports, x="Month", y="TotalImports", color='Year',
barmode='group', height=500, width=1000,
title='China VLCC dirty Imports',
color_discrete_sequence=["lightgray", "gray", "lightblue"])
fig.show()
Build your custom Flows - US Clean Exports to specific countries by Panamax and MR2¶
Get vessel class ids for Tanker vessel classes
vessel_classes = api.get_vessel_classes()
vessel_classes_df = pd.DataFrame(v.__dict__ for v in vessel_classes)
vessel_classes_df[vessel_classes_df['vessel_type']=='Tanker'].head(10)
vessel_class_id | vessel_class_name | vessel_type_id | vessel_type | |
---|---|---|---|---|
11 | 84 | VLCC | 1 | Tanker |
12 | 85 | Suezmax | 1 | Tanker |
13 | 86 | Aframax | 1 | Tanker |
14 | 87 | Panamax | 1 | Tanker |
15 | 88 | MR2 | 1 | Tanker |
16 | 89 | MR1 | 1 | Tanker |
17 | 90 | Small | 1 | Tanker |
Select the vessel classes of interest¶
Get voyages for selected vessel classes (Panamax and MR2) after a given date
vessel_class_ids = (87, 88) # Get only voyages for Aframax, Panamax and MR2
date_from = datetime.strptime("2022-09-01", "%Y-%m-%d")
voyages = []
for vessel_class_id in vessel_class_ids:
voyages.append(api.get_voyages_condensed(vessel_class_id=vessel_class_id, date_from=date_from))
voyages = pd.DataFrame([v.__dict__ for vc in voyages for v in vc])
voyages.tail(5)
imo | voyage_number | vessel_type_id | vessel_class_id | vessel_status_id | commercial_operator_id | deleted | events | id | horizon_id | ... | last_discharge_sailing_date | last_discharge_country_id | last_discharge_country_name | last_discharge_area_id_level0 | last_discharge_area_name_level0 | repairs_ind | storage_ind | sts_load_ind | sts_discharge_ind | local_trade_ind | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
36728 | 9970363 | 2 | 1 | 88 | 1 | NaN | False | None | I9822BB58VEDCE8A400 | 2 | ... | 2024-02-17 04:22:33.820000+00:00 | 240.0 | United States | None | US Atlantic Coast | False | False | False | False | False |
36729 | 9970375 | 1 | 1 | 88 | 1 | NaN | False | None | I9822C758VEDC998A00 | 2 | ... | 2024-02-28 21:48:29.342000+00:00 | 38.0 | Brazil | None | Brazil | True | False | False | False | False |
36730 | 9973913 | 1 | 1 | 88 | 1 | NaN | False | None | I98309958VEDD37BE00 | 2 | ... | NaT | NaN | None | None | None | True | False | False | False | True |
36731 | 9974981 | 1 | 1 | 88 | 1 | NaN | False | None | I9834C558VEDD103100 | 2 | ... | NaT | NaN | None | None | None | True | False | False | False | True |
36732 | 9983889 | 1 | 1 | 88 | 1 | 1216.0 | False | None | I98579158VEDD37BE00 | 2 | ... | NaT | NaN | None | None | None | True | False | False | False | True |
5 rows × 97 columns
Date of flows will be based on the export date
- Convert Timestamp to datetime, drop voyages with first_load_sailing_date in the future (ongoing or future)
voyages['first_load_sailing_date'] = voyages['first_load_sailing_date'].apply(lambda d : pd.to_datetime(d).tz_localize(None))
voyages=voyages[voyages['first_load_sailing_date']<datetime.now()]
voyages.head()
imo | voyage_number | vessel_type_id | vessel_class_id | vessel_status_id | commercial_operator_id | deleted | events | id | horizon_id | ... | last_discharge_sailing_date | last_discharge_country_id | last_discharge_country_name | last_discharge_area_id_level0 | last_discharge_area_name_level0 | repairs_ind | storage_ind | sts_load_ind | sts_discharge_ind | local_trade_ind | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 9063108 | 207 | 1 | 87 | 1 | 356.0 | False | None | I8A4AC4VEDA976100 | 1 | ... | 2022-09-16 04:50:52+00:00 | 55.0 | China | None | South China | False | False | False | False | True |
1 | 9063108 | 208 | 1 | 87 | 1 | 356.0 | False | None | I8A4AC4VEDAB1BF00 | 1 | ... | 2022-10-06 23:01:32.800000+00:00 | 55.0 | China | None | South China | False | False | False | False | True |
2 | 9063108 | 209 | 1 | 87 | 1 | 356.0 | False | None | I8A4AC4VEDABEEE00 | 1 | ... | 2022-10-24 03:57:34+00:00 | 55.0 | China | None | South China | False | False | False | False | True |
3 | 9063108 | 210 | 1 | 87 | 1 | 356.0 | False | None | I8A4AC4VEDAE67B00 | 1 | ... | 2022-11-09 03:33:00+00:00 | 55.0 | China | None | South China | False | False | False | False | True |
4 | 9063108 | 211 | 1 | 87 | 1 | 356.0 | False | None | I8A4AC4VEDAF3AA00 | 1 | ... | 2022-12-05 07:47:48+00:00 | 55.0 | China | None | South China | False | False | False | False | True |
5 rows × 97 columns
Select the cargo types of interest¶
Examine different cargo types under the "Clean" cargo group
voyages[voyages['cargo_group']=='Clean'].drop_duplicates(subset=['cargo_group', 'cargo_type'], keep='last')[['cargo_group', 'cargo_type']].head()
cargo_group | cargo_type | |
---|---|---|
36029 | Clean | Mixed Aromatics |
36410 | Clean | Clean Condensate |
36614 | Clean | Intermediate |
36647 | Clean | Gasoline Blending Component |
36697 | Clean | Jet |
voyages[(voyages['first_load_country_name']=='United States')&(voyages['cargo_group']=='Clean')]['cargo_type'].value_counts().head(10)
Gasoline 2319 Diesel 1932 Gasoil 354 Jet 334 Naphtha 270 Gasoline Blending Component 120 Intermediate 15 Mixed Aromatics 4 Clean Condensate 2 Name: cargo_type, dtype: int64
- Keep cargo types of interest
- Group by origin and destination (first load, last discharge) and cargo type
- Use last_discharge_arrival_date as date of flow
cargo_types_of_interest = ('Gasoline', 'Diesel', 'Ultra Low Sulphur Diesel', 'Gasoil', 'Naphtha', 'Unleaded Motor Spirit', 'Jet')
custom_flows_country_level = voyages[(voyages['first_load_sailing_date']>"2023-01-01")&(voyages['cargo_type'].isin(cargo_types_of_interest))] \
.set_index('first_load_sailing_date') \
.groupby(['first_load_country_name', 'last_discharge_country_name', 'cargo_type'])['quantity'] \
.resample('MS') \
.sum() \
.reset_index() \
.rename(columns={'first_load_sailing_date': 'Date'})
custom_flows_country_level.head()
first_load_country_name | last_discharge_country_name | cargo_type | Date | quantity | |
---|---|---|---|---|---|
0 | Albania | Slovenia | Diesel | 2023-08-01 | 30000.0 |
1 | Algeria | Belgium | Gasoline | 2023-04-01 | 37000.0 |
2 | Algeria | Belgium | Jet | 2023-02-01 | 37000.0 |
3 | Algeria | Belgium | Naphtha | 2023-02-01 | 146000.0 |
4 | Algeria | Belgium | Naphtha | 2023-03-01 | 37000.0 |
Select flows of interest¶
Keep only flows between US and (Mexico, Brazil, Argentina)
import_countries = ('Mexico', 'Brazil', 'Argentina')
custom_flows = custom_flows_country_level[(custom_flows_country_level['first_load_country_name']=='United States')&(custom_flows_country_level['last_discharge_country_name'].isin(import_countries))].copy()
custom_flows.head()
first_load_country_name | last_discharge_country_name | cargo_type | Date | quantity | |
---|---|---|---|---|---|
12798 | United States | Argentina | Diesel | 2023-01-01 | 76000.0 |
12799 | United States | Argentina | Diesel | 2023-02-01 | 97000.0 |
12800 | United States | Argentina | Diesel | 2023-03-01 | 60000.0 |
12801 | United States | Argentina | Diesel | 2023-04-01 | 98000.0 |
12802 | United States | Argentina | Diesel | 2023-05-01 | 108000.0 |
Plot custom flows - US Clean Exports to specific countries¶
custom_flows['Year'] = custom_flows['Date'].apply(lambda date : date.year)
custom_flows['Month'] = custom_flows['Date'].apply(lambda date : date.month)
custom_flows.sort_values(['Year', 'Month'], inplace = True)
custom_flows['Month'] = custom_flows['Date'].apply(lambda date : date.strftime("%b"))
custom_flows['Year'] = custom_flows['Date'].apply(lambda date : date.strftime("%Y"))
custom_flows['MonthYear'] = custom_flows['Month'].astype(str) + ' ' + custom_flows['Year'].astype(str)
fig = px.histogram(custom_flows, x="MonthYear", y="quantity", color='cargo_type',
height=500, width=1000, title='custom Flows - US Clean Exports to (Mexico, Brazil and Argentina)',
color_discrete_sequence=px.colors.qualitative.Pastel)
fig.update_layout(bargap=0.2)
fig.show()