Get your personal Signal Ocean API subscription key and replace it below:
signal_ocean_api_key = '' # Replace with your subscription key
Scraped Cargoes API¶
The goal of the Scraped Cargoes API is to collect and return scraped cargoes matching the given filters. This is done through the ScrapedCargoesAPI class and its methods.
1. Request by filters¶
Cargoes can be retrieved for specific filters by calling the get_cargoes method with the following arguments:
Required¶
vessel_type The vessel type
Additionally, at least one of the following is required:
cargo_ids List of CargoIDs
message_ids List of MessageIDs
external_message_ids List of ExternalMessageIDs
received_date_from Earliest date the cargo was received
received_date_to Latest date the cargo was received
updated_date_from Earliest date the cargo was updated
updated_date_to Latest date the cargo was updated
Mixing received and updated dates is not allowed.
It's highly recommended to use UTC dates, since this is the format used internally.
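For instance, a minimal sketch of a request filtered by updated dates (this assumes an api instance of ScrapedCargoesAPI, created as shown in the Quickstart below):
from datetime import datetime, timedelta
updated_date_to = datetime.utcnow()  # UTC, as recommended
updated_date_from = updated_date_to - timedelta(days=1)
recently_updated_cargoes = api.get_cargoes(
    vessel_type=1,  # Tanker
    updated_date_from=updated_date_from,
    updated_date_to=updated_date_to,
)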
2. Request by page token¶
Cargoes can also be retrieved by calling the get_cargoes_incremental method.
This method also returns a next_request_token, which should be passed to the next incremental request to retrieve only the latest updated cargoes.
Required¶
vessel_type The vessel type
Optional¶
page_token The token to pass in order to retrieve the relevant page.
3. Retrieve page token¶
The token referenced above can be retrieved for a specific date by calling the get_cargoes_incremental_token method with the following argument:
Required¶
updated_date_from Earliest date the cargo was updated
Additional optional arguments¶
The get_cargoes and get_cargoes_incremental methods also accept the following optional arguments:
include_details If this field is True the following columns will be included in the response (otherwise they will be None):
parsed_part_id, line_from, line_to, in_line_order, source
include_scraped_fields If this field is True the following columns will be included in the response (otherwise they will be None):
scraped_laycan, scraped_load, scraped_load2, scraped_discharge, scraped_discharge_options, scraped_discharge2, scraped_charterer, scraped_cargo_type, scraped_quantity, scraped_delivery_date, scraped_delivery_from, scraped_delivery_to,
scraped_redelivery_from, scraped_redelivery_to
include_labels If this field is True the following columns will be included in the response (otherwise they will be None):
load_name, load_taxonomy, load_name2, load_taxonomy2, discharge_name, discharge_taxonomy, discharge_name2, discharge_taxonomy2, charterer, cargo_type, cargo_type_group, delivery_from_name, delivery_from_taxonomy, delivery_to_name, delivery_to_taxonomy, redelivery_from_name, redelivery_from_taxonomy, redelivery_to_name, redelivery_to_taxonomy, charter_type, cargo_status
include_content If this field is True the following columns will be included in the response (otherwise they will be None):
content
include_sender If this field is True the following columns will be included in the response (otherwise they will be None):
sender
include_debug_info If this field is True the following columns will be included in the response (otherwise they will be None):
is_private
Default value is True for the arguments described above.
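For example, a minimal sketch that turns some of these flags off to slim down the response (again assuming an api instance of ScrapedCargoesAPI, as created in the Quickstart below):
from datetime import datetime, timedelta
slim_cargoes = api.get_cargoes(
    vessel_type=1,  # Tanker
    received_date_from=datetime.utcnow() - timedelta(days=1),
    include_content=False,  # the content column will be None
    include_sender=False,  # the sender column will be None
    include_debug_info=False,  # the is_private column will be None
)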
Installation¶
To install the Signal Ocean SDK, simply run the following command
%%capture
%pip install signal-ocean
Quickstart¶
Import signal-ocean and other modules required for this demo
from signal_ocean import Connection
from signal_ocean.scraped_cargoes import ScrapedCargoesAPI, ScrapedCargo
from datetime import datetime, timedelta
import pandas as pd
import plotly.graph_objects as go
Create a new instance of the ScrapedCargoesAPI class
connection = Connection(signal_ocean_api_key)
api = ScrapedCargoesAPI(connection)
Now you are ready to retrieve your data
Request by date¶
To get all tanker cargoes received in the last 4 days, you must declare appropriate vessel_type and received_date_from variables
vessel_type = 1 # Tanker
received_date_from = datetime.utcnow() - timedelta(days=4)
And then call the get_cargoes method, as below
scraped_cargoes = api.get_cargoes(
vessel_type=vessel_type,
received_date_from=received_date_from,
)
next(iter(scraped_cargoes), None)
ScrapedCargo(cargo_id=45442083, message_id=69570386, external_message_id=None, parsed_part_id=74686406, line_from=80, line_to=80, in_line_order=None, source='Email', updated_date=datetime.datetime(2024, 7, 29, 9, 16, 43, tzinfo=datetime.timezone.utc), received_date=datetime.datetime(2024, 7, 29, 9, 15, 42, tzinfo=datetime.timezone.utc), is_deleted=False, low_confidence=False, scraped_laycan='07-08/aug', laycan_from=datetime.datetime(2024, 8, 7, 0, 0, tzinfo=datetime.timezone.utc), laycan_to=datetime.datetime(2024, 8, 8, 0, 0, tzinfo=datetime.timezone.utc), scraped_load='ag', load_geo_id=24777, load_name='Arabian Gulf', load_taxonomy_id=4, load_taxonomy='Level0', scraped_load2=None, load_geo_id2=None, load_name2=None, load_taxonomy_id2=None, load_taxonomy2=None, scraped_discharge='japan', scraped_discharge_options=None, discharge_geo_id=121, discharge_name='Japan', discharge_taxonomy_id=3, discharge_taxonomy='Country', scraped_discharge2=None, discharge_geo_id2=None, discharge_name2=None, discharge_taxonomy_id2=None, discharge_taxonomy2=None, scraped_charterer='shell', charterer_id=1482, charterer='Shell', scraped_cargo_type=None, cargo_type_id=None, cargo_type=None, cargo_type_group_id=None, cargo_type_group=None, scraped_quantity='75', quantity=75000.0, quantity_buffer=0.0, quantity_from=75000.0, quantity_to=75000.0, size_from=None, size_to=None, scraped_delivery_date=None, delivery_date_from=None, delivery_date_to=None, scraped_delivery_from=None, delivery_from_geo_id=None, delivery_from_name=None, delivery_from_taxonomy_id=None, delivery_from_taxonomy=None, scraped_delivery_to=None, delivery_to_geo_id=None, delivery_to_name=None, delivery_to_taxonomy_id=None, delivery_to_taxonomy=None, scraped_redelivery_from=None, redelivery_from_geo_id=None, redelivery_from_name=None, redelivery_from_taxonomy_id=None, redelivery_from_taxonomy=None, scraped_redelivery_to=None, redelivery_to_geo_id=None, redelivery_to_name=None, redelivery_to_taxonomy_id=None, redelivery_to_taxonomy=None, charter_type_id=0, charter_type='Voyage', cargo_status_id=None, cargo_status=None, content='07-08/aug shell ag japan 75 naf o/p proteus harvonne hld', subject='McQuilling Asia - Clean East Daily Market Report (29 Jul 2024)', sender='McQuilling', is_private=True)
For better visualization, it's convenient to load the data into a DataFrame
df = pd.DataFrame(scraped_cargoes)
df.head()
| cargo_id | message_id | external_message_id | parsed_part_id | line_from | line_to | in_line_order | source | updated_date | received_date | ... | redelivery_to_taxonomy_id | redelivery_to_taxonomy | charter_type_id | charter_type | cargo_status_id | cargo_status | content | subject | sender | is_private | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 45442083 | 69570386 | None | 74686406 | 80 | 80 | NaN | 2024-07-29 09:16:43+00:00 | 2024-07-29 09:15:42+00:00 | ... | NaN | None | 0 | Voyage | NaN | None | 07-08/aug shell ag japan 75 naf o/p proteus ha... | McQuilling Asia - Clean East Daily Market Repo... | McQuilling | True | |
| 1 | 45442084 | 69570386 | None | 74686406 | 58 | 58 | 1.0 | 2024-07-29 09:16:43+00:00 | 2024-07-29 09:15:42+00:00 | ... | NaN | None | 0 | Voyage | 0.0 | OnSubs | 01-02/aug ats taiwan spore-ag 60 um s 850k-14m... | McQuilling Asia - Clean East Daily Market Repo... | McQuilling | True | |
| 2 | 45442085 | 69570386 | None | 74686406 | 51 | 51 | NaN | 2024-07-29 09:16:43+00:00 | 2024-07-29 09:15:42+00:00 | ... | NaN | None | 0 | Voyage | 0.0 | OnSubs | 5-aug bf muara spore 35 cpf 285k daian subs | McQuilling Asia - Clean East Daily Market Repo... | McQuilling | True | |
| 3 | 45442086 | 69570386 | None | 74686406 | 41 | 41 | NaN | 2024-07-29 09:16:43+00:00 | 2024-07-29 09:15:42+00:00 | ... | NaN | None | 0 | Voyage | 0.0 | OnSubs | 03-05/aug marubeni korea japan 35 um s rnr ten... | McQuilling Asia - Clean East Daily Market Repo... | McQuilling | True | |
| 4 | 45442087 | 69570386 | None | 74686406 | 58 | 58 | NaN | 2024-07-29 09:16:43+00:00 | 2024-07-29 09:15:42+00:00 | ... | NaN | None | 0 | Voyage | 0.0 | OnSubs | 01-02/aug ats taiwan spore-ag 60 um s 850k-14m... | McQuilling Asia - Clean East Daily Market Repo... | McQuilling | True |
5 rows × 82 columns
Request by Message or ExternalMessage IDs¶
To retrieve cargoes for particular message ID(s), you should include an extra parameter called message_ids when using the get_cargoes method. This parameter should contain a list of message IDs. For instance,
message_ids = [69320619,69201092,69216273,69274650,69314539]
scraped_cargoes_by_message_ids = api.get_cargoes(
vessel_type=vessel_type,
message_ids=message_ids,
)
next(iter(scraped_cargoes_by_message_ids), None)
ScrapedCargo(cargo_id=45279461, message_id=69201092, external_message_id=None, parsed_part_id=74439682, line_from=71, line_to=71, in_line_order=None, source='Email', updated_date=datetime.datetime(2024, 7, 24, 4, 32, 16, tzinfo=datetime.timezone.utc), received_date=datetime.datetime(2024, 7, 24, 4, 30, tzinfo=datetime.timezone.utc), is_deleted=False, low_confidence=False, scraped_laycan='26-27 aug', laycan_from=datetime.datetime(2024, 8, 26, 0, 0, tzinfo=datetime.timezone.utc), laycan_to=datetime.datetime(2024, 8, 27, 0, 0, tzinfo=datetime.timezone.utc), scraped_load='balikpapan', load_geo_id=3486, load_name='Balikpapan', load_taxonomy_id=2, load_taxonomy='Port', scraped_load2=None, load_geo_id2=None, load_name2=None, load_taxonomy_id2=None, load_taxonomy2=None, scraped_discharge='opts', scraped_discharge_options=None, discharge_geo_id=None, discharge_name=None, discharge_taxonomy_id=None, discharge_taxonomy=None, scraped_discharge2=None, discharge_geo_id2=None, discharge_name2=None, discharge_taxonomy_id2=None, discharge_taxonomy2=None, scraped_charterer='tender', charterer_id=None, charterer=None, scraped_cargo_type=None, cargo_type_id=None, cargo_type=None, cargo_type_group_id=None, cargo_type_group=None, scraped_quantity=None, quantity=None, quantity_buffer=None, quantity_from=None, quantity_to=None, size_from=None, size_to=None, scraped_delivery_date=None, delivery_date_from=None, delivery_date_to=None, scraped_delivery_from=None, delivery_from_geo_id=None, delivery_from_name=None, delivery_from_taxonomy_id=None, delivery_from_taxonomy=None, scraped_delivery_to=None, delivery_to_geo_id=None, delivery_to_name=None, delivery_to_taxonomy_id=None, delivery_to_taxonomy=None, scraped_redelivery_from=None, redelivery_from_geo_id=None, redelivery_from_name=None, redelivery_from_taxonomy_id=None, redelivery_from_taxonomy=None, scraped_redelivery_to=None, redelivery_to_geo_id=None, redelivery_to_name=None, redelivery_to_taxonomy_id=None, redelivery_to_taxonomy=None, charter_type_id=0, charter_type='Voyage', cargo_status_id=None, cargo_status=None, content='tender 200kb v1250 balikpapan opts 26-27 aug clsd', subject='RAFFLES MARKET REPORT 24TH JUL 2024', sender='Raffles Shipbrokers', is_private=True)
You can achieve a similar result for external message IDs by providing an argument called external_message_ids.
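For instance, a minimal sketch (the external message IDs below are hypothetical placeholders; substitute your own):
external_message_ids = ['external-id-1', 'external-id-2'] # Hypothetical external message IDs
scraped_cargoes_by_external_ids = api.get_cargoes(
    vessel_type=vessel_type,
    external_message_ids=external_message_ids,
)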
Request by Cargo IDs¶
To retrieve cargoes for particular cargo ID(s), you should include an extra parameter called cargo_ids when using the get_cargoes method.
This parameter should contain a list of cargo IDs. For instance,
cargo_ids = [45336637,45336638,45336639,45336640,45336641] # Or add a list of your desired cargo IDs
scraped_cargoes_by_ids = api.get_cargoes(
vessel_type=vessel_type,
cargo_ids=cargo_ids,
)
df_by_ids = pd.DataFrame(scraped_cargoes_by_ids)
df_by_ids.head()
| cargo_id | message_id | external_message_id | parsed_part_id | line_from | line_to | in_line_order | source | updated_date | received_date | ... | redelivery_to_taxonomy_id | redelivery_to_taxonomy | charter_type_id | charter_type | cargo_status_id | cargo_status | content | subject | sender | is_private | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 45336637 | 69320619 | None | 74519826 | 124 | 124 | None | 2024-07-25 05:33:57+00:00 | 2024-07-25 05:30:18+00:00 | ... | None | None | 0 | Voyage | None | None | tender 200 kb mfo dumai opts 23-24 aug clsd | SHIPLINKS EAST OF SUEZ DIRTY MARKET REPORT (25... | ShipLinks | True | |
| 1 | 45336638 | 69320619 | None | 74519826 | 119 | 119 | None | 2024-07-25 05:33:57+00:00 | 2024-07-25 05:30:18+00:00 | ... | None | None | 0 | Voyage | None | None | tender 200 kb v1250 sungai pakning opts 02-03 ... | SHIPLINKS EAST OF SUEZ DIRTY MARKET REPORT (25... | ShipLinks | True | |
| 2 | 45336639 | 69320619 | None | 74519826 | 120 | 120 | None | 2024-07-25 05:33:57+00:00 | 2024-07-25 05:30:18+00:00 | ... | None | None | 0 | Voyage | None | None | tender 200 kb v1250 sungai pakning opts 13-14 ... | SHIPLINKS EAST OF SUEZ DIRTY MARKET REPORT (25... | ShipLinks | True | |
| 3 | 45336640 | 69320619 | None | 74519825 | 103 | 103 | None | 2024-07-25 05:33:57+00:00 | 2024-07-25 05:30:18+00:00 | ... | None | None | 0 | Voyage | None | None | tender 200 kb mfo cilacap opts 21-22 aug clsd | SHIPLINKS EAST OF SUEZ DIRTY MARKET REPORT (25... | ShipLinks | True | |
| 4 | 45336641 | 69320619 | None | 74519825 | 102 | 102 | None | 2024-07-25 05:33:57+00:00 | 2024-07-25 05:30:18+00:00 | ... | None | None | 0 | Voyage | None | None | tender 200 kb v1250 sungai pakning opts 26-27 ... | SHIPLINKS EAST OF SUEZ DIRTY MARKET REPORT (25... | ShipLinks | True |
5 rows × 82 columns
Retrieve incremental token¶
The get_cargoes_incremental_token method returns a page_token for a given updated_date_from value.
This page_token can then be passed to the get_cargoes_incremental method, in order to retrieve only new or updated cargoes.
For instance, to retrieve the relevant page token for last week's data, call the get_cargoes_incremental_token method, as below
updated_date_from = datetime.utcnow() - timedelta(days=7)
page_token = api.get_cargoes_incremental_token(
updated_date_from=updated_date_from,
)
page_token
'MjAyNC0wNy0yNlQwODo0NzoyMy4wMDBa'
Request by page token¶
If the page_token parameter is not specified, the get_cargoes_incremental method retrieves all cargoes. You can call the get_cargoes_incremental method, as below
scraped_cargoes_incremental = api.get_cargoes_incremental(
vessel_type=vessel_type,
page_token=page_token,
)
df_incremental = pd.DataFrame(scraped_cargoes_incremental.data)
df_incremental.head()
| cargo_id | message_id | external_message_id | parsed_part_id | line_from | line_to | in_line_order | source | updated_date | received_date | ... | redelivery_to_taxonomy_id | redelivery_to_taxonomy | charter_type_id | charter_type | cargo_status_id | cargo_status | content | subject | sender | is_private | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 45396552 | 69453094 | None | 74606440.0 | 41.0 | 41.0 | NaN | 2024-07-26 09:10:02+00:00 | 2024-07-26 09:08:58+00:00 | ... | NaN | None | 0.0 | Voyage | NaN | None | repsol 130 12-aug nhc sankofa spain 90ws popi p - | DA' OCEANIC SHIPPING - DAILY DPP MARKET REPORT... | Da'oceanic Shipping | True | |
| 1 | 45397078 | 69454207 | None | 74607105.0 | 203.0 | 203.0 | NaN | 2024-07-26 09:20:30+00:00 | 2024-07-26 09:19:24+00:00 | ... | NaN | None | 0.0 | Voyage | NaN | None | equinor mongstad trans-atlantic 210ws green sky - | DA' OCEANIC SHIPPING - WEEKLY CPP MARKET REPOR... | Da'oceanic Shipping | True | |
| 2 | 45397079 | 69454207 | None | 74607105.0 | 91.0 | 91.0 | NaN | 2024-07-26 09:20:30+00:00 | 2024-07-26 09:19:24+00:00 | ... | NaN | None | 0.0 | Voyage | NaN | None | kpc kuwait pakistan $595k em zenith - | DA' OCEANIC SHIPPING - WEEKLY CPP MARKET REPOR... | Da'oceanic Shipping | True | |
| 3 | 45397080 | 69454207 | None | 74607105.0 | 61.0 | 61.0 | 1.0 | 2024-07-26 09:20:30+00:00 | 2024-07-26 09:19:24+00:00 | ... | NaN | None | 0.0 | Voyage | NaN | None | reliance sikka ukc-usac $4.30-4.75m hafnia yar... | DA' OCEANIC SHIPPING - WEEKLY CPP MARKET REPOR... | Da'oceanic Shipping | True | |
| 4 | 45397081 | 69454207 | None | 74607105.0 | 226.0 | 226.0 | NaN | 2024-07-26 09:20:30+00:00 | 2024-07-26 09:19:24+00:00 | ... | NaN | None | 0.0 | Voyage | NaN | None | maersk usg opts own fpmc 36 - | DA' OCEANIC SHIPPING - WEEKLY CPP MARKET REPOR... | Da'oceanic Shipping | True |
5 rows × 82 columns
The get_cargoes_incremental method also returns a next_request_token, which should be used as the page_token of the next incremental call so that only new or updated cargoes are retrieved, as sketched below.
next_request_token = scraped_cargoes_incremental.next_request_token
next_request_token
'MjAyNC0wOC0wMlQwODo0NzoyNC4xNDha'
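Putting the pieces together, here is a minimal sketch of the incremental loop: each call's next_request_token becomes the page_token of the following call, so every iteration retrieves only new or updated cargoes (persisting the token between runs is left to you):
page_token = api.get_cargoes_incremental_token(
    updated_date_from=datetime.utcnow() - timedelta(days=7),
)
for _ in range(3): # E.g. three successive incremental pulls
    response = api.get_cargoes_incremental(
        vessel_type=vessel_type,
        page_token=page_token,
    )
    new_cargoes = list(response.data)
    # ... process new_cargoes here ...
    page_token = response.next_request_token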
Usage of optional arguments¶
By default, all fields are returned. In many cases, it is convenient to select specific columns, for example to compare the scraped and the mapped fields
scraped_mapped_columns = [
'scraped_charterer',
'charterer',
'scraped_quantity',
'quantity',
'scraped_load',
'load_name',
]
scraped_mapped_df = pd.DataFrame(scraped_cargoes, columns=scraped_mapped_columns)
scraped_mapped_df.head()
| scraped_charterer | charterer | scraped_quantity | quantity | scraped_load | load_name | |
|---|---|---|---|---|---|---|
| 0 | shell | Shell | 75 | 75000.0 | ag | Arabian Gulf |
| 1 | lr | None | 60 | 60000.0 | ats taiwan | Taiwan |
| 2 | None | None | 35 | 35000.0 | muara | Muara |
| 3 | None | None | 35 | 35000.0 | marubeni korea | Korea, Republic of |
| 4 | lr | None | 60 | 60000.0 | ats taiwan | Taiwan |
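Conversely, when only the mapped values are needed, the scraped columns can be skipped at request time through the optional arguments described earlier (a sketch reusing the vessel_type and received_date_from defined above):
mapped_only_cargoes = api.get_cargoes(
    vessel_type=vessel_type,
    received_date_from=received_date_from,
    include_scraped_fields=False, # All scraped_* columns will be None
)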
Examples¶
Let's start by fetching all tanker cargoes received in the last 2 weeks
example_vessel_type = 1 # Tanker
example_date_from = datetime.utcnow() - timedelta(days=14)
example_scraped_cargoes = api.get_cargoes(
vessel_type=example_vessel_type,
received_date_from=example_date_from,
)
Exclude deleted scraped cargoes¶
The is_deleted property of a scraped cargo indicates whether it is valid or not. If it is set to True, the corresponding cargo_id has been replaced by a new one.
To avoid stale entries, we will exclude deleted scraped cargoes in the following examples
example_scraped_cargoes = [cargo for cargo in example_scraped_cargoes if not cargo.is_deleted]
next(iter(example_scraped_cargoes), None)
ScrapedCargo(cargo_id=45136695, message_id=68888739, external_message_id=None, parsed_part_id=74227216, line_from=26, line_to=26, in_line_order=None, source='Email', updated_date=datetime.datetime(2024, 7, 19, 9, 14, 26, tzinfo=datetime.timezone.utc), received_date=datetime.datetime(2024, 7, 19, 9, 9, 22, tzinfo=datetime.timezone.utc), is_deleted=False, low_confidence=False, scraped_laycan='20-25/8', laycan_from=datetime.datetime(2024, 8, 20, 0, 0, tzinfo=datetime.timezone.utc), laycan_to=datetime.datetime(2024, 8, 25, 0, 0, tzinfo=datetime.timezone.utc), scraped_load='usg', load_geo_id=24676, load_name='US Gulf', load_taxonomy_id=4, load_taxonomy='Level0', scraped_load2=None, load_geo_id2=None, load_name2=None, load_taxonomy_id2=None, load_taxonomy2=None, scraped_discharge='ulsan', scraped_discharge_options=None, discharge_geo_id=3602, discharge_name='Ulsan', discharge_taxonomy_id=2, discharge_taxonomy='Port', scraped_discharge2=None, discharge_geo_id2=None, discharge_name2=None, discharge_taxonomy_id2=None, discharge_taxonomy2=None, scraped_charterer='sk', charterer_id=1528, charterer='SK', scraped_cargo_type=None, cargo_type_id=None, cargo_type=None, cargo_type_group_id=None, cargo_type_group=None, scraped_quantity='270', quantity=270000.0, quantity_buffer=0.0, quantity_from=270000.0, quantity_to=270000.0, size_from=None, size_to=None, scraped_delivery_date=None, delivery_date_from=None, delivery_date_to=None, scraped_delivery_from=None, delivery_from_geo_id=None, delivery_from_name=None, delivery_from_taxonomy_id=None, delivery_from_taxonomy=None, scraped_delivery_to=None, delivery_to_geo_id=None, delivery_to_name=None, delivery_to_taxonomy_id=None, delivery_to_taxonomy=None, scraped_redelivery_from=None, redelivery_from_geo_id=None, redelivery_from_name=None, redelivery_from_taxonomy_id=None, redelivery_from_taxonomy=None, scraped_redelivery_to=None, redelivery_to_geo_id=None, redelivery_to_name=None, redelivery_to_taxonomy_id=None, redelivery_to_taxonomy=None, charter_type_id=0, charter_type='Voyage', cargo_status_id=None, cargo_status=None, content='sk 270 usg/ulsan 20-25/8 -rptd', subject='VLCC update', sender='Fearnleys', is_private=True)
Now, we are ready to insert our data into a DataFrame and keep only specific fields
example_columns = [
'charterer',
'laycan_from',
'load_name',
'quantity',
'is_deleted',
]
data = pd.DataFrame(example_scraped_cargoes, columns=example_columns)
data.head()
| charterer | laycan_from | load_name | quantity | is_deleted | |
|---|---|---|---|---|---|
| 0 | SK | 2024-08-20 00:00:00+00:00 | US Gulf | 270000.0 | False |
| 1 | Equinor | 2024-08-20 00:00:00+00:00 | US Gulf | 270000.0 | False |
| 2 | Equinor | 2024-08-20 00:00:00+00:00 | US Gulf | 270000.0 | False |
| 3 | Petco Chartering | 2024-08-01 00:00:00+00:00 | Arabian Gulf | 270000.0 | False |
| 4 | BP | 2024-08-04 00:00:00+00:00 | Jebel Dhanna | 270000.0 | False |
Top 10 Charterers¶
In this example, we will find the top 10 Charterers, based on the number of distinct available cargoes
top_chrtr_ser = data[['charterer', 'laycan_from']].drop_duplicates().charterer.value_counts().head(10)
top_chrtr_df = top_chrtr_ser.to_frame(name='CargoCount').reset_index()
top_chrtr_df
| charterer | CargoCount | |
|---|---|---|
| 0 | Aramco Trading Company | 30 |
| 1 | Vitol | 27 |
| 2 | Shell | 22 |
| 3 | Unipec | 21 |
| 4 | BP | 20 |
| 5 | Trafigura | 16 |
| 6 | CSSA | 16 |
| 7 | ExxonMobil | 15 |
| 8 | ADMIC | 15 |
| 9 | Indian Oil | 14 |
And display results in a bar plot
top_chrtr_fig = go.Figure()
bar = go.Bar(
x=top_chrtr_df.charterer.tolist(),
y=top_chrtr_df.CargoCount.tolist(),
)
top_chrtr_fig.add_trace(bar)
top_chrtr_fig.update_xaxes(title_text="Charterer")
top_chrtr_fig.update_yaxes(title_text="Number of available Cargoes")
top_chrtr_fig.show()
Total quantity to load in specific areas per day over the next week¶
this_week_days = pd.date_range(start=datetime.utcnow().date(), freq='D', periods=7, tz='UTC')
areas = data[data.load_name.notna()].load_name.value_counts().head().index.tolist()
areas
['US Gulf', 'Arabian Gulf', 'Singapore', 'Mumbai', 'Sikka']
Create the pivot table
areas_mask = data.load_name.isin(areas) & data.laycan_from.isin(this_week_days)
df_areas = data[areas_mask]
df_pivot = pd.pivot_table(
df_areas,
columns='load_name',
index='laycan_from',
values='quantity',
aggfunc='sum',
fill_value=0,
).reindex(index=this_week_days, fill_value=0).reset_index().rename(columns={'index': 'laycan_from'})
df_pivot
| load_name | laycan_from | Arabian Gulf | Mumbai | Sikka | Singapore | US Gulf |
|---|---|---|---|---|---|---|
| 0 | 2024-08-02 00:00:00+00:00 | 2480000 | 0 | 0 | 240000 | 2014000 |
| 1 | 2024-08-03 00:00:00+00:00 | 1350000 | 0 | 960000 | 140000 | 3184000 |
| 2 | 2024-08-04 00:00:00+00:00 | 1585000 | 0 | 120000 | 3440000 | 76000 |
| 3 | 2024-08-05 00:00:00+00:00 | 1605000 | 1140000 | 0 | 1040000 | 1662000 |
| 4 | 2024-08-06 00:00:00+00:00 | 2865000 | 650000 | 70000 | 360000 | 1232000 |
| 5 | 2024-08-07 00:00:00+00:00 | 1165000 | 0 | 0 | 480000 | 4536000 |
| 6 | 2024-08-08 00:00:00+00:00 | 4310000 | 30000 | 210000 | 885960 | 178000 |
And display the results as a time series
def area_button(area):
args = [
{'visible': [i == areas.index(area) for i in range(len(areas))]},
{
'title': f'Total Quantity to load in {area} per day',
'showlegend': True
},
]
return dict(
label=area,
method='update',
args=args,
)
title = 'Total Quantity to load per day'
today = datetime.combine(datetime.utcnow().date(), datetime.min.time())
areas_fig = go.Figure()
area_buttons = []
for area in areas:
if area not in df_pivot.columns:
continue
area_scatter_plot = go.Scatter(
x=df_pivot.laycan_from,
y=df_pivot[area],
name=area,
mode='lines',
)
areas_fig.add_trace(area_scatter_plot)
area_buttons.append(area_button(area))
buttons = [
dict(
label='All',
method='update',
args=[
{'visible': [True for _ in range(len(areas))]},
{
'title': title,
'showlegend': True
}
],
),
*area_buttons,
]
areas_fig.update_layout(
title=title,
updatemenus=[go.layout.Updatemenu(
active=0,
buttons=buttons,
)],
xaxis_range=[today - timedelta(hours=4), today + timedelta(hours=24*6 + 4)],
)
areas_fig.show()
Export data to csv¶
output_path = '' # Replace with your output path
filename = 'last_two_weeks_cargoes.csv'
if not data.empty:
data.to_csv(output_path+filename, index=False)