Historical Cargo Distribution in a Level 1 Area¶
Run this example in Colab¶
APIs Used : Scraped Cargoes API, Geos API
Description :
This notebook finds and displays the cargo supply in a given Level 1 Area over a specific time window.
It walks through installing the Signal Ocean SDK and importing the dependencies used to process the data.
The parameters
vessel_type_id, days_back, Area_level_1
are then initialized and used to produce the desired output.
Finally, the retrieved data is displayed as a time-series chart.
Output : Time-Series Graph displaying the public supply of cargoes in a Level 1 Area for a specific time frame. More on area levels here.
Setup¶
Install the Signal Ocean package
%%capture
%pip install signal-ocean
Import signal_ocean
and other modules required for this demo
from signal_ocean import Connection
from signal_ocean.scraped_cargoes import ScrapedCargoesAPI, ScrapedCargo
from signal_ocean.geos import GeosAPI
from datetime import datetime, timedelta
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 50)
Parameters Setup¶
To get all tanker cargoes received in the last X days, we must declare the vessel_type_id = 1
and received_date_from
variables, where X is set by days_back.
signal_ocean_api_key = '' # Replace with your subscription key
vessel_type_id = 1 # Tanker
days_back = 20
received_date_from = datetime.utcnow() - timedelta(days=days_back)
Area_level_1 = 'US Gulf & Mainland' # Area Level 1 that we want to visualize
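The cutoff above is built with `datetime.utcnow()`, which returns a naive timestamp and is deprecated as of Python 3.12. A minimal sketch of a timezone-aware equivalent follows; the name `received_date_from_aware` is illustrative, and whether the API treats aware and naive datetimes identically is an assumption that has not been verified here.

```python
from datetime import datetime, timedelta, timezone

days_back = 20  # same look-back window as in the parameters above

# Timezone-aware "now" in UTC, stepped back by the look-back window.
received_date_from_aware = datetime.now(timezone.utc) - timedelta(days=days_back)
print(received_date_from_aware.isoformat())
```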
Connection¶
Create a new instance of the ScrapedCargoesAPI
and GeosAPI
classes
connection = Connection(signal_ocean_api_key)
scraped_cargoes_api = ScrapedCargoesAPI(connection)
geos_api = GeosAPI(connection)
Now we are ready to retrieve our data
Main CodeBlock¶
Scraped Cargoes API: Fetch Data and Deduplicate¶
You may also find more on our Scraped Cargoes API here, including documentation of the object methods used and more examples.
Call the get_cargoes
method as shown below, then discard any cargoes flagged as deleted
scraped_cargoes = scraped_cargoes_api.get_cargoes(vessel_type = vessel_type_id,received_date_from = received_date_from)
scraped_cargoes = [cargo for cargo in scraped_cargoes if not cargo.is_deleted]
For better visualization, it's convenient to insert data into a DataFrame
scraped_cargoes_data_frame = pd.DataFrame(scraped_cargoes)
scraped_cargoes_data_frame.head()
cargo_id | message_id | external_message_id | parsed_part_id | line_from | line_to | in_line_order | source | updated_date | received_date | is_deleted | low_confidence | scraped_laycan | laycan_from | laycan_to | scraped_load | load_geo_id | load_name | load_taxonomy_id | load_taxonomy | scraped_load2 | load_geo_id2 | load_name2 | load_taxonomy_id2 | load_taxonomy2 | scraped_discharge | scraped_discharge_options | discharge_geo_id | discharge_name | discharge_taxonomy_id | discharge_taxonomy | scraped_discharge2 | discharge_geo_id2 | discharge_name2 | discharge_taxonomy_id2 | discharge_taxonomy2 | scraped_charterer | charterer_id | charterer | scraped_cargo_type | cargo_type_id | cargo_type | cargo_type_group_id | cargo_type_group | scraped_quantity | quantity | quantity_buffer | quantity_from | quantity_to | size_from | size_to | scraped_delivery_date | delivery_date_from | delivery_date_to | scraped_delivery_from | delivery_from_geo_id | delivery_from_name | delivery_from_taxonomy_id | delivery_from_taxonomy | scraped_delivery_to | delivery_to_geo_id | delivery_to_name | delivery_to_taxonomy_id | delivery_to_taxonomy | scraped_redelivery_from | redelivery_from_geo_id | redelivery_from_name | redelivery_from_taxonomy_id | redelivery_from_taxonomy | scraped_redelivery_to | redelivery_to_geo_id | redelivery_to_name | redelivery_to_taxonomy_id | redelivery_to_taxonomy | charter_type_id | charter_type | cargo_status_id | cargo_status | content | subject | sender | is_private | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 53017905 | 84975073 | None | 85772118 | 54 | 54 | NaN | 2025-02-04 16:49:11+00:00 | 2025-02-04 16:48:33+00:00 | False | False | 15/02 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | arzew | 3358.0 | Arzew | 2.0 | Port | None | NaN | None | NaN | None | ukc | None | 25016.0 | UK Continent | 5.0 | Level1 | None | NaN | None | NaN | None | vitol | 1831.0 | Vitol | cond | 152000.0 | Condensate | 150000.0 | Mixed | 80 | 80000.0 | 0.0 | 80000.0 | 80000.0 | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | vitol cond 80 15/02 arzew/ukc | BRAEMAR AFRAMAX REPORT TUESDAY 04TH FEBRUARY 2025 | Braemar Shipping | True | |
1 | 53018220 | 84975533 | None | 85772488 | 42 | 42 | NaN | 2025-02-04 16:59:54+00:00 | 2025-02-04 16:59:09+00:00 | False | False | 15 feb | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | algeria | 69.0 | Algeria | 3.0 | Country | None | NaN | None | NaN | None | opts | None | NaN | None | NaN | None | None | NaN | None | NaN | None | vitol | 1831.0 | Vitol | crude | 19.0 | Crude Oil | 130000.0 | Dirty | 80 | 80000.0 | 0.0 | 80000.0 | 80000.0 | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | vitol 80 crude 15 feb algeria opts open | POTEN EUROPE AFRAMAX REPORT 4TH FEBRUARY 2025 | Poten | True | |
2 | 53018221 | 84975533 | None | 85772489 | 43 | 43 | NaN | 2025-02-04 16:59:54+00:00 | 2025-02-04 16:59:09+00:00 | False | False | 15 feb | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | algeria | 69.0 | Algeria | 3.0 | Country | None | NaN | None | NaN | None | opts | None | NaN | None | NaN | None | None | NaN | None | NaN | None | vitol | 1831.0 | Vitol | crude | 19.0 | Crude Oil | 130000.0 | Dirty | 80 | 80000.0 | 0.0 | 80000.0 | 80000.0 | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | vitol 80 crude 15 feb algeria opts open | POTEN EUROPE AFRAMAX REPORT 4TH FEBRUARY 2025 | Poten | True | |
3 | 53018651 | 84976083 | None | 85772927 | 29 | 29 | NaN | 2025-02-04 17:12:09+00:00 | 2025-02-04 17:10:15+00:00 | False | False | 17/feb | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | algeria | 69.0 | Algeria | 3.0 | Country | None | NaN | None | NaN | None | opts | None | NaN | None | NaN | None | None | NaN | None | NaN | None | vitol | 1831.0 | Vitol | None | NaN | None | NaN | None | 80 | 80000.0 | 0.0 | 80000.0 | 80000.0 | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | vitol 80 algeria/opts 17/feb | NOLARMA MED AFRAMAX REPORT | Nolarma | True | |
4 | 53018652 | 84976083 | None | 85772927 | 10 | 10 | NaN | 2025-02-04 17:12:09+00:00 | 2025-02-04 17:10:15+00:00 | False | True | None | NaT | NaT | None | NaN | None | NaN | None | None | NaN | None | NaN | None | None | None | NaN | None | NaN | None | None | NaN | None | NaN | None | None | NaN | None | None | NaN | None | NaN | None | None | NaN | NaN | NaN | NaN | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | fixing window 18 - 23 feb if bsea ckale canc 2... | NOLARMA MED AFRAMAX REPORT | Nolarma | True |
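The filter-then-DataFrame step above can be tried without an API key. The sketch below uses a hypothetical `ToyCargo` dataclass as a stand-in for `ScrapedCargo` (the real class has many more fields, as the table shows) and assumes the SDK objects behave like dataclasses, which pandas expands into one column per field.

```python
from dataclasses import dataclass
from typing import Optional

import pandas as pd


@dataclass(frozen=True)
class ToyCargo:  # hypothetical stand-in, not the SDK class
    cargo_id: int
    load_name: Optional[str]
    is_deleted: bool


toy_cargoes = [
    ToyCargo(1, "Arzew", False),
    ToyCargo(2, "Algeria", True),  # flagged as deleted, filtered out below
    ToyCargo(3, None, False),
]

# Same filter as in the notebook: keep only cargoes that are not deleted.
toy_active = [cargo for cargo in toy_cargoes if not cargo.is_deleted]

# pandas turns each dataclass field into a column.
toy_frame = pd.DataFrame(toy_active)
print(toy_frame)
```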
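We deduplicate cargoes on charterer, quantity (or size), load or delivery location, and laycan or delivery dates, together with the day on which the cargo was received.
First, we add columns to the DataFrame that condense the fields of interest: voyage cargoes use the load location and laycan dates, while other charter types use the delivery location and delivery dates.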
scraped_cargoes_data_frame["quantity_size_from"] = ""
scraped_cargoes_data_frame["quantity_size_to"] = ""
scraped_cargoes_data_frame["load_delivery_id"] = ""
scraped_cargoes_data_frame["load_delivery_name"] = ""
scraped_cargoes_data_frame["laycan_delivery_date_from"] = ""
scraped_cargoes_data_frame["laycan_delivery_date_to"] = ""
for ind in scraped_cargoes_data_frame.index:
    CharterType = scraped_cargoes_data_frame.loc[ind, 'charter_type']
    if not pd.isna(scraped_cargoes_data_frame.loc[ind, "quantity_from"]):
        scraped_cargoes_data_frame.loc[ind, "quantity_size_from"] = scraped_cargoes_data_frame.loc[ind, "quantity_from"]
        scraped_cargoes_data_frame.loc[ind, "quantity_size_to"] = scraped_cargoes_data_frame.loc[ind, "quantity_to"]
    else:
        scraped_cargoes_data_frame.loc[ind, "quantity_size_from"] = scraped_cargoes_data_frame.loc[ind, "size_from"]
        scraped_cargoes_data_frame.loc[ind, "quantity_size_to"] = scraped_cargoes_data_frame.loc[ind, "size_to"]
    if CharterType == 'Voyage':
        scraped_cargoes_data_frame.loc[ind, "load_delivery_id"] = scraped_cargoes_data_frame.loc[ind, "load_geo_id"]
        scraped_cargoes_data_frame.loc[ind, "load_delivery_name"] = scraped_cargoes_data_frame.loc[ind, "load_name"]
        scraped_cargoes_data_frame.loc[ind, "laycan_delivery_date_from"] = scraped_cargoes_data_frame.loc[ind, "laycan_from"]
        scraped_cargoes_data_frame.loc[ind, "laycan_delivery_date_to"] = scraped_cargoes_data_frame.loc[ind, "laycan_to"]
    else:
        scraped_cargoes_data_frame.loc[ind, "load_delivery_id"] = scraped_cargoes_data_frame.loc[ind, "delivery_from_geo_id"]
        scraped_cargoes_data_frame.loc[ind, "load_delivery_name"] = scraped_cargoes_data_frame.loc[ind, "delivery_from_name"]
        scraped_cargoes_data_frame.loc[ind, "laycan_delivery_date_from"] = scraped_cargoes_data_frame.loc[ind, "delivery_date_from"]
        scraped_cargoes_data_frame.loc[ind, "laycan_delivery_date_to"] = scraped_cargoes_data_frame.loc[ind, "delivery_date_to"]
scraped_cargoes_data_frame["parent_cargo"] = 0
scraped_cargoes_data_frame.head()
cargo_id | message_id | external_message_id | parsed_part_id | line_from | line_to | in_line_order | source | updated_date | received_date | is_deleted | low_confidence | scraped_laycan | laycan_from | laycan_to | scraped_load | load_geo_id | load_name | load_taxonomy_id | load_taxonomy | scraped_load2 | load_geo_id2 | load_name2 | load_taxonomy_id2 | load_taxonomy2 | scraped_discharge | scraped_discharge_options | discharge_geo_id | discharge_name | discharge_taxonomy_id | discharge_taxonomy | scraped_discharge2 | discharge_geo_id2 | discharge_name2 | discharge_taxonomy_id2 | discharge_taxonomy2 | scraped_charterer | charterer_id | charterer | scraped_cargo_type | cargo_type_id | cargo_type | cargo_type_group_id | cargo_type_group | scraped_quantity | quantity | quantity_buffer | quantity_from | quantity_to | size_from | size_to | scraped_delivery_date | delivery_date_from | delivery_date_to | scraped_delivery_from | delivery_from_geo_id | delivery_from_name | delivery_from_taxonomy_id | delivery_from_taxonomy | scraped_delivery_to | delivery_to_geo_id | delivery_to_name | delivery_to_taxonomy_id | delivery_to_taxonomy | scraped_redelivery_from | redelivery_from_geo_id | redelivery_from_name | redelivery_from_taxonomy_id | redelivery_from_taxonomy | scraped_redelivery_to | redelivery_to_geo_id | redelivery_to_name | redelivery_to_taxonomy_id | redelivery_to_taxonomy | charter_type_id | charter_type | cargo_status_id | cargo_status | content | subject | sender | is_private | quantity_size_from | quantity_size_to | load_delivery_id | load_delivery_name | laycan_delivery_date_from | laycan_delivery_date_to | parent_cargo | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 53017905 | 84975073 | None | 85772118 | 54 | 54 | NaN | 2025-02-04 16:49:11+00:00 | 2025-02-04 16:48:33+00:00 | False | False | 15/02 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | arzew | 3358.0 | Arzew | 2.0 | Port | None | NaN | None | NaN | None | ukc | None | 25016.0 | UK Continent | 5.0 | Level1 | None | NaN | None | NaN | None | vitol | 1831.0 | Vitol | cond | 152000.0 | Condensate | 150000.0 | Mixed | 80 | 80000.0 | 0.0 | 80000.0 | 80000.0 | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | vitol cond 80 15/02 arzew/ukc | BRAEMAR AFRAMAX REPORT TUESDAY 04TH FEBRUARY 2025 | Braemar Shipping | True | 80000.0 | 80000.0 | 3358.0 | Arzew | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | 0 | |
1 | 53018220 | 84975533 | None | 85772488 | 42 | 42 | NaN | 2025-02-04 16:59:54+00:00 | 2025-02-04 16:59:09+00:00 | False | False | 15 feb | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | algeria | 69.0 | Algeria | 3.0 | Country | None | NaN | None | NaN | None | opts | None | NaN | None | NaN | None | None | NaN | None | NaN | None | vitol | 1831.0 | Vitol | crude | 19.0 | Crude Oil | 130000.0 | Dirty | 80 | 80000.0 | 0.0 | 80000.0 | 80000.0 | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | vitol 80 crude 15 feb algeria opts open | POTEN EUROPE AFRAMAX REPORT 4TH FEBRUARY 2025 | Poten | True | 80000.0 | 80000.0 | 69.0 | Algeria | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | 0 | |
2 | 53018221 | 84975533 | None | 85772489 | 43 | 43 | NaN | 2025-02-04 16:59:54+00:00 | 2025-02-04 16:59:09+00:00 | False | False | 15 feb | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | algeria | 69.0 | Algeria | 3.0 | Country | None | NaN | None | NaN | None | opts | None | NaN | None | NaN | None | None | NaN | None | NaN | None | vitol | 1831.0 | Vitol | crude | 19.0 | Crude Oil | 130000.0 | Dirty | 80 | 80000.0 | 0.0 | 80000.0 | 80000.0 | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | vitol 80 crude 15 feb algeria opts open | POTEN EUROPE AFRAMAX REPORT 4TH FEBRUARY 2025 | Poten | True | 80000.0 | 80000.0 | 69.0 | Algeria | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | 0 | |
3 | 53018651 | 84976083 | None | 85772927 | 29 | 29 | NaN | 2025-02-04 17:12:09+00:00 | 2025-02-04 17:10:15+00:00 | False | False | 17/feb | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | algeria | 69.0 | Algeria | 3.0 | Country | None | NaN | None | NaN | None | opts | None | NaN | None | NaN | None | None | NaN | None | NaN | None | vitol | 1831.0 | Vitol | None | NaN | None | NaN | None | 80 | 80000.0 | 0.0 | 80000.0 | 80000.0 | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | vitol 80 algeria/opts 17/feb | NOLARMA MED AFRAMAX REPORT | Nolarma | True | 80000.0 | 80000.0 | 69.0 | Algeria | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | 0 | |
4 | 53018652 | 84976083 | None | 85772927 | 10 | 10 | NaN | 2025-02-04 17:12:09+00:00 | 2025-02-04 17:10:15+00:00 | False | True | None | NaT | NaT | None | NaN | None | NaN | None | None | NaN | None | NaN | None | None | None | NaN | None | NaN | None | None | NaN | None | NaN | None | None | NaN | None | None | NaN | None | NaN | None | None | NaN | NaN | NaN | NaN | NaN | NaN | None | NaT | NaT | None | NaN | None | NaN | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | 0 | Voyage | NaN | None | fixing window 18 - 23 feb if bsea ckale canc 2... | NOLARMA MED AFRAMAX REPORT | Nolarma | True | NaN | NaN | NaN | None | NaT | NaT | 0 |
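The row-by-row loop above can be slow on large result sets. As a rough alternative, the same "use quantity if present, otherwise size" and "use load fields for voyage cargoes, otherwise delivery fields" rules can be written with `Series.where`. This is only a sketch on toy data: the `toy` frame and the "Time Charter" label are illustrative assumptions, not values taken from the API.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "charter_type": ["Voyage", "Time Charter", "Voyage"],  # second label is illustrative
    "quantity_from": [80000.0, np.nan, np.nan],
    "quantity_to": [80000.0, np.nan, np.nan],
    "size_from": [np.nan, 60000.0, np.nan],
    "size_to": [np.nan, 75000.0, np.nan],
    "load_name": ["Arzew", None, "Algeria"],
    "delivery_from_name": [None, "Singapore", None],
})

has_quantity = toy["quantity_from"].notna()
is_voyage = toy["charter_type"] == "Voyage"

# Keep the quantity where it exists, otherwise fall back to the size range.
toy["quantity_size_from"] = toy["quantity_from"].where(has_quantity, toy["size_from"])
toy["quantity_size_to"] = toy["quantity_to"].where(has_quantity, toy["size_to"])

# Voyage cargoes use the load location, other charter types the delivery location.
toy["load_delivery_name"] = toy["load_name"].where(is_voyage, toy["delivery_from_name"])
print(toy[["quantity_size_from", "quantity_size_to", "load_delivery_name"]])
```

The same pattern would extend to the id and date columns; it is shown here for three columns only to keep the sketch short.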
Keep only the columns needed for deduplication
scraped_cargoes_data_frame_2 = scraped_cargoes_data_frame[['received_date','charterer','quantity_size_from','quantity_size_to','load_delivery_name','load_taxonomy','load_delivery_id','laycan_delivery_date_from','laycan_delivery_date_to','charter_type','parent_cargo']].copy()
scraped_cargoes_data_frame_2.head(5)
received_date | charterer | quantity_size_from | quantity_size_to | load_delivery_name | load_taxonomy | load_delivery_id | laycan_delivery_date_from | laycan_delivery_date_to | charter_type | parent_cargo | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2025-02-04 16:48:33+00:00 | Vitol | 80000.0 | 80000.0 | Arzew | Port | 3358.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
1 | 2025-02-04 16:59:09+00:00 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
2 | 2025-02-04 16:59:09+00:00 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
3 | 2025-02-04 17:10:15+00:00 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | Voyage | 0 |
4 | 2025-02-04 17:10:15+00:00 | None | NaN | NaN | None | None | NaN | NaT | NaT | Voyage | 0 |
Cargo count before deduplication
len(scraped_cargoes_data_frame_2.index)
11318
# Keep only the calendar date so that same-day reports of a cargo compare equal
scraped_cargoes_data_frame_2 = scraped_cargoes_data_frame_2.assign(received_date=scraped_cargoes_data_frame['received_date'].dt.date)
scraped_cargoes_data_frame_2.head()
received_date | charterer | quantity_size_from | quantity_size_to | load_delivery_name | load_taxonomy | load_delivery_id | laycan_delivery_date_from | laycan_delivery_date_to | charter_type | parent_cargo | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Arzew | Port | 3358.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
1 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
2 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
3 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | Voyage | 0 |
4 | 2025-02-04 | None | NaN | NaN | None | None | NaN | NaT | NaT | Voyage | 0 |
scraped_cargoes_data_frame_2.set_index('received_date').head(10)
charterer | quantity_size_from | quantity_size_to | load_delivery_name | load_taxonomy | load_delivery_id | laycan_delivery_date_from | laycan_delivery_date_to | charter_type | parent_cargo | |
---|---|---|---|---|---|---|---|---|---|---|
received_date | ||||||||||
2025-02-04 | Vitol | 80000.0 | 80000.0 | Arzew | Port | 3358.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | Voyage | 0 |
2025-02-04 | None | NaN | NaN | None | None | NaN | NaT | NaT | Voyage | 0 |
2025-02-04 | Unipec | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | 2025-02-17 00:00:00+00:00 | 2025-02-21 00:00:00+00:00 | Voyage | 0 |
2025-02-04 | Unipec | 270000.0 | 270000.0 | Arabian Gulf | Level0 | 24777.0 | 2025-02-20 00:00:00+00:00 | 2025-02-25 00:00:00+00:00 | Voyage | 0 |
2025-02-04 | Occidental | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | NaT | NaT | Voyage | 0 |
2025-02-04 | Indian Oil | 260000.0 | 260000.0 | Basrah | Port | 3534.0 | 2025-02-19 00:00:00+00:00 | 2025-02-19 00:00:00+00:00 | Voyage | 0 |
2025-02-04 | Vitol | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | 2025-03-05 00:00:00+00:00 | 2025-03-10 00:00:00+00:00 | Voyage | 0 |
scraped_cargoes_data_frame_deduplicated = scraped_cargoes_data_frame_2.drop_duplicates()
scraped_cargoes_data_frame_deduplicated.head(10)
received_date | charterer | quantity_size_from | quantity_size_to | load_delivery_name | load_taxonomy | load_delivery_id | laycan_delivery_date_from | laycan_delivery_date_to | charter_type | parent_cargo | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Arzew | Port | 3358.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
1 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 |
3 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | Voyage | 0 |
4 | 2025-02-04 | None | NaN | NaN | None | None | NaN | NaT | NaT | Voyage | 0 |
5 | 2025-02-04 | Unipec | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | 2025-02-17 00:00:00+00:00 | 2025-02-21 00:00:00+00:00 | Voyage | 0 |
6 | 2025-02-04 | Unipec | 270000.0 | 270000.0 | Arabian Gulf | Level0 | 24777.0 | 2025-02-20 00:00:00+00:00 | 2025-02-25 00:00:00+00:00 | Voyage | 0 |
7 | 2025-02-04 | Occidental | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | NaT | NaT | Voyage | 0 |
8 | 2025-02-04 | Indian Oil | 260000.0 | 260000.0 | Basrah | Port | 3534.0 | 2025-02-19 00:00:00+00:00 | 2025-02-19 00:00:00+00:00 | Voyage | 0 |
9 | 2025-02-04 | Vitol | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | 2025-03-05 00:00:00+00:00 | 2025-03-10 00:00:00+00:00 | Voyage | 0 |
10 | 2025-02-04 | Resource Marine | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | NaT | NaT | Voyage | 0 |
Count of cargoes after deduplication
len(scraped_cargoes_data_frame_deduplicated.index)
4250
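The drop from 11318 to 4250 rows depends on truncating `received_date` to a calendar date before calling `drop_duplicates`. A small sketch on made-up rows shows why: with full timestamps, two reports of the same cargo received minutes apart still count as different rows.

```python
import pandas as pd

toy_reports = pd.DataFrame({
    "received_date": pd.to_datetime([
        "2025-02-04 16:59:09+00:00",
        "2025-02-04 17:10:15+00:00",  # same cargo, reported again later that day
        "2025-02-05 09:00:00+00:00",  # same cargo, reported the next day
    ]),
    "charterer": ["Vitol", "Vitol", "Vitol"],
    "load_delivery_name": ["Algeria", "Algeria", "Algeria"],
})

# With full timestamps every row is unique.
rows_with_timestamps = len(toy_reports.drop_duplicates())

# After truncating to the date, the two same-day reports collapse into one.
toy_daily = toy_reports.assign(received_date=toy_reports["received_date"].dt.date)
toy_deduplicated = toy_daily.drop_duplicates()

print(rows_with_timestamps, len(toy_deduplicated))
```

Note that the same cargo reported on two different days is still counted once per day, which is what the per-day supply series needs.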
Number of cargoes received per day globally
date_counts = scraped_cargoes_data_frame_deduplicated.groupby('received_date').size().reset_index(name='count')
date_counts
received_date | count | |
---|---|---|
0 | 2025-02-04 | 93 |
1 | 2025-02-05 | 358 |
2 | 2025-02-06 | 340 |
3 | 2025-02-07 | 312 |
4 | 2025-02-10 | 268 |
5 | 2025-02-11 | 275 |
6 | 2025-02-12 | 274 |
7 | 2025-02-13 | 287 |
8 | 2025-02-14 | 318 |
9 | 2025-02-17 | 299 |
10 | 2025-02-18 | 315 |
11 | 2025-02-19 | 301 |
12 | 2025-02-20 | 284 |
13 | 2025-02-21 | 291 |
14 | 2025-02-24 | 235 |
distinct_taxonomies = scraped_cargoes_data_frame['load_taxonomy'].unique()
distinct_taxonomies
array(['Port', 'Country', None, 'Level0', 'GeoAsset', 'Unknown', 'Level1', 'Level2', 'Level3'], dtype=object)
Geos API: Retrieve All Ports with Area Level 0, Area Level 1 and Area Level 2 Names¶
Scraped cargoes come with locations of different granularity: as the taxonomies listed above show, a location may be a port, a geo asset, a country or an area at any level.
To run our analysis, we have to normalize every cargo location to the same area level. We therefore use the Geos API to build look-up tables that map ports, countries and areas to their corresponding higher-level areas.
You may also find more on our Geos API here, including documentation of the object methods used and more examples.
all_areas = geos_api.get_areas()
df_areas = pd.DataFrame([a.__dict__ for a in all_areas])
all_ports = geos_api.get_ports()
df_ports = pd.DataFrame([a.__dict__ for a in all_ports])
all_countries = geos_api.get_countries()  # countries, not ports, feed the country look-up table
df_countries = pd.DataFrame([a.__dict__ for a in all_countries])
all_geoAssets = geos_api.get_geoAssets()
df_geoAssets = pd.DataFrame([a.__dict__ for a in all_geoAssets])
df_areas_all = df_areas[df_areas['location_taxonomy_id'] == 7].merge(df_areas,how = 'left',left_on = 'area_id',right_on = 'parent_area_id',suffixes = ['_level3','_level2'])[['area_id_level3','area_name_level3','area_id_level2','area_name_level2']]\
.merge(df_areas,how = 'left',left_on = 'area_id_level2',right_on = 'parent_area_id')[['area_id_level3','area_name_level3','area_id_level2','area_name_level2','area_id','area_name']]\
.merge(df_areas,how = 'left',left_on = 'area_id',right_on = 'parent_area_id',suffixes = ['_level1','_level0'])[['area_id_level3','area_name_level3','area_id_level2','area_name_level2','area_id_level1','area_name_level1','area_id_level0','area_name_level0']]
# DataFrame with Area3,Area2,Area1,Area0 of ports
df_areas_ports_all = df_areas_all.merge(df_ports,how = 'right',left_on = ['area_id_level2','area_id_level1','area_id_level0'], right_on = ['area_id_level2','area_id_level1','area_id_level0'],suffixes = ['_prev', '_port'])
df_areas_ports_all = df_areas_ports_all[['area_id_level2','area_name_level2_prev','area_id_level1','area_name_level1_prev','area_id_level0', 'area_name_level0_prev','port_id','port_name']]
df_areas_ports_all = df_areas_ports_all.loc[df_areas_ports_all['area_id_level1'] > 0].loc[df_areas_ports_all['port_id'] > 0]
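The chained merges above walk the area hierarchy one level at a time by joining each area to the areas that name it as their parent. A minimal sketch of a single such step on a toy table follows; the ids are invented, and using `0` to mean "no parent" is a convention of this sketch only, not of the Geos API.

```python
import pandas as pd

toy_areas = pd.DataFrame({
    "area_id": [1, 2, 3, 4],
    "area_name": ["US Gulf & Mainland", "US Gulf", "Mediterranean", "East Mediterranean"],
    "parent_area_id": [0, 1, 0, 3],  # 0 = no parent (toy convention)
})

# Parent areas on the left, their children on the right.
toy_level1 = toy_areas[toy_areas["parent_area_id"] == 0]
toy_lookup = toy_level1.merge(
    toy_areas,
    how="left",
    left_on="area_id",
    right_on="parent_area_id",
    suffixes=("_level1", "_level0"),
)[["area_id_level1", "area_name_level1", "area_id_level0", "area_name_level0"]]

print(toy_lookup)
```

Repeating the same join once per level gives one wide row per lowest-level area, which is the shape `df_areas_all` has.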
# data frame with country, Areas level 1 of countries
df_country_arealevel1 = df_countries.merge(df_ports,how = 'left',left_on = 'country_id',right_on = 'country_id',suffixes = ['_country',None])[['port_id','port_name','country_id','country_name_country','area_id_level1']]\
.merge(df_areas.loc[df_areas['location_taxonomy_id'] == 5],how = 'left',left_on = 'area_id_level1',right_on = 'area_id',suffixes = (None, '_area'))[['country_id','country_name_country','area_id', 'area_name']]
df_country_areaslevel1 = df_country_arealevel1.copy()
df_country_areaslevel1 = df_country_areaslevel1[['country_id','country_name_country']].drop_duplicates()
df_country_areaslevel1['Areas_Level_1_In_Country'] = None
df_country_areaslevel1 = df_country_areaslevel1.reset_index(drop=True)
for idx, country in df_country_areaslevel1.iterrows():
    country_name = country['country_name_country']
    df_country_areaslevel1.at[idx,'Areas_Level_1_In_Country'] = df_country_arealevel1.loc[df_country_arealevel1['country_name_country'] == country_name ]['area_name'].unique()
df_country_areaslevel1
country_id | country_name_country | Areas_Level_1_In_Country | |
---|---|---|---|
0 | -104 | Korea | Korea / Japan |
1 | -103 | Arabian Gulf | Arabian Gulf |
2 | -102 | Caribs | Caribs |
3 | -101 | East Mediterranean | Mediterranean |
4 | -2 | Not set | Unknown |
... | ... | ... | ... |
191 | 234 | Tuvalu | Pacific Islands |
192 | 33 | Saint Barthelemy | Caribs |
193 | 146 | Moldova, Republic of | Black Sea / Sea Of Marmara |
194 | 61 | Christmas Island | South East Asia |
195 | 216 | Sao Tome and Principe | West Africa |
196 rows × 3 columns
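The per-country loop above collects the distinct Level 1 areas of each country. One possible alternative, sketched here on invented data, is a `groupby` followed by `unique`, which also yields plain Python lists instead of NumPy arrays. The country and area names below are placeholders.

```python
import pandas as pd

toy_country_area = pd.DataFrame({
    "country_id": [1, 1, 1, 2],
    "country_name_country": ["Toyland", "Toyland", "Toyland", "Examplia"],
    "area_name": ["North", "South", "North", "East"],
})

# One row per country, with the distinct areas gathered into a list.
areas_per_country = (
    toy_country_area
    .groupby(["country_id", "country_name_country"])["area_name"]
    .unique()
    .apply(list)
    .reset_index(name="Areas_Level_1_In_Country")
)

print(areas_per_country)
```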
Normalize cargo locations to Area Level 1¶
# Work on an explicit copy so that adding a column does not trigger SettingWithCopyWarning
scraped_cargoes_data_frame_deduplicated = scraped_cargoes_data_frame_deduplicated.copy()
scraped_cargoes_data_frame_deduplicated['level1_area_load_delivery'] = None
scraped_cargoes_data_frame_deduplicated.head(5)
received_date | charterer | quantity_size_from | quantity_size_to | load_delivery_name | load_taxonomy | load_delivery_id | laycan_delivery_date_from | laycan_delivery_date_to | charter_type | parent_cargo | level1_area_load_delivery | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Arzew | Port | 3358.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 | None |
1 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 | None |
3 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | Voyage | 0 | None |
4 | 2025-02-04 | None | NaN | NaN | None | None | NaN | NaT | NaT | Voyage | 0 | None |
5 | 2025-02-04 | Unipec | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | 2025-02-17 00:00:00+00:00 | 2025-02-21 00:00:00+00:00 | Voyage | 0 | None |
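In the code block below, we map each cargo's load or delivery location to its corresponding Level 1 Area, using a different look-up for each location taxonomy (Level 1, Level 0, Country, Port and GeoAsset). Locations with an Unknown or Level 2 taxonomy are marked as N/A.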
load_taxonomy_valid = ['GeoAsset', 'Port', 'Level0', 'Country', 'Level1']
load_taxonomy_invalid = ['Unknown','Level2']
for idx, row in scraped_cargoes_data_frame_deduplicated.iterrows():
    taxonomy = row['load_taxonomy']
    if taxonomy in load_taxonomy_invalid:
        scraped_cargoes_data_frame_deduplicated.at[idx,'level1_area_load_delivery'] = "N/A"
    #Load Taxonomy = AreaLevel1
    if taxonomy == 'Level1':
        area_name = row['load_delivery_name']
        scraped_cargoes_data_frame_deduplicated.at[idx,'level1_area_load_delivery'] = area_name
    #Load Taxonomy = AreaLevel0
    if taxonomy == 'Level0':
        area_level_0_id = row['load_delivery_id']
        scraped_cargoes_data_frame_deduplicated.at[idx,'level1_area_load_delivery']=df_areas_all.loc[df_areas_all['area_id_level0'] == area_level_0_id]['area_name_level1'].values[0]
    #Load Taxonomy = Country
    if taxonomy == 'Country':
        country_id= row['load_delivery_id']
        scraped_cargoes_data_frame_deduplicated.at[idx,'level1_area_load_delivery']=df_country_areaslevel1.loc[df_country_areaslevel1['country_id'] == country_id]['Areas_Level_1_In_Country'].values
    #Load Taxonomy = Port
    if taxonomy == 'Port':
        port_id = row['load_delivery_id']
        scraped_cargoes_data_frame_deduplicated.at[idx,'level1_area_load_delivery']= df_areas_ports_all.loc[df_areas_ports_all['port_id'] == port_id]['area_name_level1_prev'].values[0]
    #Load Taxonomy = GeoAsset
    if taxonomy == 'GeoAsset':
        geoasset_id = row['load_delivery_id']
        scraped_cargoes_data_frame_deduplicated.at[idx,'level1_area_load_delivery']= df_geoAssets.loc[df_geoAssets['geo_asset_id'] == geoasset_id]['area_name_level1'].values
scraped_cargoes_data_frame_deduplicated.reset_index(drop = True)
received_date | charterer | quantity_size_from | quantity_size_to | load_delivery_name | load_taxonomy | load_delivery_id | laycan_delivery_date_from | laycan_delivery_date_to | charter_type | parent_cargo | level1_area_load_delivery | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Arzew | Port | 3358.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 | Mediterranean |
1 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 | Mediterranean |
2 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | Voyage | 0 | Mediterranean |
3 | 2025-02-04 | None | NaN | NaN | None | None | NaN | NaT | NaT | Voyage | 0 | None |
4 | 2025-02-04 | Unipec | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | 2025-02-17 00:00:00+00:00 | 2025-02-21 00:00:00+00:00 | Voyage | 0 | US Gulf & Mainland |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
4245 | 2025-02-24 | Unipec | 70000.0 | 70000.0 | US Gulf | Level0 | 24676.0 | 2025-03-06 00:00:00+00:00 | 2025-03-06 00:00:00+00:00 | Voyage | 0 | US Gulf & Mainland |
4246 | 2025-02-24 | ENI | 70000.0 | 70000.0 | East Coast Mexico | Level0 | 24671.0 | 2025-03-03 00:00:00+00:00 | 2025-03-03 00:00:00+00:00 | Voyage | 0 | East Coast Mexico |
4247 | 2025-02-24 | BP | 80000.0 | 80000.0 | Mongstad | Port | 3697.0 | 2025-02-28 00:00:00+00:00 | 2025-03-04 00:00:00+00:00 | Voyage | 0 | North Sea |
4248 | 2025-02-24 | Rosneft | 80000.0 | 80000.0 | Southwold | Port | 3434.0 | 2025-03-08 00:00:00+00:00 | 2025-03-09 00:00:00+00:00 | Voyage | 0 | UK Continent |
4249 | 2025-02-24 | Phillips 66 | 80000.0 | 80000.0 | Skikda | Port | 3360.0 | 2025-03-07 00:00:00+00:00 | 2025-03-08 00:00:00+00:00 | Voyage | 0 | Mediterranean |
4250 rows × 12 columns
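In the loop above, the `.values[0]` look-ups raise an `IndexError` whenever an id is missing from the look-up table. A hedged alternative for the Port case is to turn the look-up table into a Series keyed by id and use `Series.map`, which leaves unmatched ids as NaN instead of failing. The sketch uses two ports from the output above plus a made-up id `9999` with no match; the frame and variable names are illustrative.

```python
import pandas as pd

toy_ports = pd.DataFrame({
    "port_id": [3358, 3697],
    "area_name_level1": ["Mediterranean", "North Sea"],
})
toy_port_cargoes = pd.DataFrame({
    "load_taxonomy": ["Port", "Port", "Port"],
    "load_delivery_id": [3358.0, 3697.0, 9999.0],  # 9999 is a made-up id with no match
})

# Look-up Series keyed by port id, cast to float to match the cargo id column.
port_to_level1 = (
    toy_ports
    .assign(port_id=toy_ports["port_id"].astype(float))
    .set_index("port_id")["area_name_level1"]
)

# Unmatched ids become NaN rather than raising an error.
toy_port_cargoes["level1_area_load_delivery"] = toy_port_cargoes["load_delivery_id"].map(port_to_level1)
print(toy_port_cargoes)
```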
Prepare final table and Graph¶
Find the Cargoes that correspond to the specific Level 1 Area
timeseries_table = scraped_cargoes_data_frame_deduplicated.copy()
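def matches_area(value):  # Some countries belong to multiple Level 1 areas, so a value may be a name or a collection of names
    if isinstance(value, str):
        return value == Area_level_1
    if isinstance(value, (list, tuple, np.ndarray)):  # the look-ups above return NumPy arrays, possibly nested
        return any(matches_area(item) for item in value)
    return False
matching_rows = timeseries_table['level1_area_load_delivery'].apply(matches_area)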
timeseries_table['matching_rows'] = matching_rows
timeseries_table.head(5)
received_date | charterer | quantity_size_from | quantity_size_to | load_delivery_name | load_taxonomy | load_delivery_id | laycan_delivery_date_from | laycan_delivery_date_to | charter_type | parent_cargo | level1_area_load_delivery | matching_rows | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Arzew | Port | 3358.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 | Mediterranean | False |
1 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-15 00:00:00+00:00 | 2025-02-15 00:00:00+00:00 | Voyage | 0 | Mediterranean | False |
3 | 2025-02-04 | Vitol | 80000.0 | 80000.0 | Algeria | Country | 69.0 | 2025-02-17 00:00:00+00:00 | 2025-02-17 00:00:00+00:00 | Voyage | 0 | Mediterranean | False |
4 | 2025-02-04 | None | NaN | NaN | None | None | NaN | NaT | NaT | Voyage | 0 | None | False |
5 | 2025-02-04 | Unipec | 270000.0 | 270000.0 | US Gulf | Level0 | 24676.0 | 2025-02-17 00:00:00+00:00 | 2025-02-21 00:00:00+00:00 | Voyage | 0 | US Gulf & Mainland | True |
Daily count of public cargoes in the selected Level 1 Area
date_counts = timeseries_table.loc[timeseries_table['matching_rows']].groupby('received_date').size().reset_index(name='count') # Cargoes received per day in the selected Level 1 Area
date_counts # Number of public cargoes for each day in our time frame
received_date | count | |
---|---|---|
0 | 2025-02-04 | 16 |
1 | 2025-02-05 | 27 |
2 | 2025-02-06 | 23 |
3 | 2025-02-07 | 18 |
4 | 2025-02-10 | 16 |
5 | 2025-02-11 | 12 |
6 | 2025-02-12 | 13 |
7 | 2025-02-13 | 9 |
8 | 2025-02-14 | 17 |
9 | 2025-02-17 | 14 |
10 | 2025-02-18 | 12 |
11 | 2025-02-19 | 10 |
12 | 2025-02-20 | 19 |
13 | 2025-02-21 | 19 |
14 | 2025-02-24 | 11 |
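The table above has no rows for days on which no cargo was received (weekends, in this sample), so a line chart would join Friday directly to Monday. If explicit zeros are preferred, the counts can be reindexed over a full daily range. This is a sketch of that optional step using three rows from the table above; `toy_counts` and `daily_counts` are illustrative names.

```python
import pandas as pd

toy_counts = pd.DataFrame({
    "received_date": pd.to_datetime(["2025-02-07", "2025-02-10", "2025-02-11"]).date,
    "count": [18, 16, 12],
})

# Convert the date objects back to timestamps so they can index a daily range.
dates = pd.DatetimeIndex(pd.to_datetime(toy_counts["received_date"]))
full_range = pd.date_range(dates.min(), dates.max(), freq="D")

# Days without any cargo get an explicit count of zero.
daily_counts = (
    pd.Series(toy_counts["count"].to_numpy(), index=dates)
    .reindex(full_range, fill_value=0)
)
print(daily_counts)
```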
Output¶
Create the chart and add its details
plt.figure(figsize=(12, 6))
plt.plot(date_counts["received_date"], date_counts["count"], marker="o", linestyle="-", color="b", label="Cargoes")
plt.title("Cargo Time Series in " + Area_level_1)
plt.xlabel("Date")
plt.ylabel("Number of cargoes")
plt.grid(True)
plt.legend()
plt.xticks(date_counts["received_date"], rotation=45)
plt.tight_layout()
plt.show()