Subsetting Data#
Using existing xarray functions#
Tracks are loaded as an xarray.Dataset, which has many built-in methods for subsetting data. For indexing, for example, see the xarray indexing documentation.
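As a minimal sketch of positional indexing, the snippet below builds a small synthetic dataset (an assumption standing in for loaded tracks, with a single `record` dimension as in the examples on this page) and selects records with `xarray.Dataset.isel`:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a tracks dataset: one value per record
tracks = xr.Dataset(
    {"lon": ("record", np.array([120.5, 119.0, 55.25, 54.25, 62.25]))}
)

# Select the first three records with a slice
first_three = tracks.isel(record=slice(0, 3))

# Select specific records by integer position
picked = tracks.isel(record=[0, 4])

print(first_three.lon.values)
print(picked.lon.values)
```

Positional indexing is useful when the record positions are already known; selection by condition is covered next.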
For more specific selections, use xarray.Dataset.where with the argument drop=True, which removes the records that do not satisfy the condition. For example:
[1]:
import huracanpy
tracks = huracanpy.load(huracanpy.example_csv_file)
# Select all points with longitude > 60
print(tracks.lon, "\n")
tracks_subset = tracks.where(tracks.lon > 60, drop=True)
print(tracks_subset.lon)
<xarray.DataArray 'lon' (record: 99)>
array([120.5 , 119. , 119. , 119.25, 119.5 , 118.75, 118.5 , 118.25,
118.25, 118.25, 118.75, 119.25, 119.25, 119.75, 120. , 120. ,
119.5 , 119.25, 118.25, 117.5 , 117. , 117. , 116.75, 116.75,
117.5 , 119.25, 121. , 123.5 , 127.5 , 130.25, 131.25, 149.5 ,
151.5 , 154. , 156.25, 158.5 , 159.5 , 160. , 160. , 160. ,
158.25, 156. , 154.25, 153.25, 152.75, 152.5 , 152.5 , 153. ,
153.75, 154.75, 156. , 55.25, 54.25, 52.75, 54. , 55. ,
52. , 51. , 52. , 51.5 , 50.75, 50.25, 50.5 , 50.75,
49.75, 50. , 50.5 , 51. , 51.75, 52.25, 53. , 53.25,
52.75, 54.5 , 53. , 53.75, 53.5 , 53.25, 53.25, 52.75,
52.5 , 52.25, 52.5 , 53.25, 54.25, 55.5 , 56.75, 58.25,
59.5 , 59.25, 58.75, 58.25, 57.75, 57.5 , 57.25, 57.5 ,
58.5 , 60.25, 62.25])
Coordinates:
* record (record) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98
<xarray.DataArray 'lon' (record: 53)>
array([120.5 , 119. , 119. , 119.25, 119.5 , 118.75, 118.5 , 118.25,
118.25, 118.25, 118.75, 119.25, 119.25, 119.75, 120. , 120. ,
119.5 , 119.25, 118.25, 117.5 , 117. , 117. , 116.75, 116.75,
117.5 , 119.25, 121. , 123.5 , 127.5 , 130.25, 131.25, 149.5 ,
151.5 , 154. , 156.25, 158.5 , 159.5 , 160. , 160. , 160. ,
158.25, 156. , 154.25, 153.25, 152.75, 152.5 , 152.5 , 153. ,
153.75, 154.75, 156. , 60.25, 62.25])
Coordinates:
* record (record) int64 0 1 2 3 4 5 6 7 8 9 ... 44 45 46 47 48 49 50 97 98
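To make the effect of drop=True concrete, here is a small sketch on a synthetic dataset (the variable names and values are assumptions, not the example file). It compares `where` with and without `drop=True`, and shows how two conditions can be combined:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a tracks dataset: one value per record
tracks = xr.Dataset(
    {
        "lon": ("record", np.array([120.5, 55.25, 62.25, 150.0])),
        "lat": ("record", np.array([-10.0, -15.0, -20.0, 5.0])),
    }
)

# Without drop=True the shape is kept and non-matching records become NaN
masked = tracks.where(tracks.lon > 60)

# With drop=True the non-matching records are removed
subset = tracks.where(tracks.lon > 60, drop=True)

# Conditions combine with & and |; each comparison needs its own parentheses
box = tracks.where((tracks.lon > 60) & (tracks.lat < 0), drop=True)

print(masked.lon.values)
print(subset.lon.values)
print(box.lon.values)
```

Note that `where` fills masked values with NaN, so integer variables are generally promoted to float in the result.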
Selecting times#
Generally the time array is loaded as a np.datetime64 array. Comparing it directly with a standard-library datetime.datetime object therefore fails with a TypeError:
[2]:
import datetime
# Try to select a subset of times based on datetime
print(tracks.time)
tracks_subset = tracks.where(tracks.time > datetime.datetime(1980, 1, 10), drop=True)
<xarray.DataArray 'time' (record: 99)>
array(['1980-01-06T06:00:00.000000000', '1980-01-06T12:00:00.000000000',
'1980-01-06T18:00:00.000000000', '1980-01-07T00:00:00.000000000',
'1980-01-07T06:00:00.000000000', '1980-01-07T12:00:00.000000000',
'1980-01-07T18:00:00.000000000', '1980-01-08T00:00:00.000000000',
'1980-01-08T06:00:00.000000000', '1980-01-08T12:00:00.000000000',
'1980-01-08T18:00:00.000000000', '1980-01-09T00:00:00.000000000',
'1980-01-09T06:00:00.000000000', '1980-01-09T12:00:00.000000000',
'1980-01-09T18:00:00.000000000', '1980-01-10T00:00:00.000000000',
'1980-01-10T06:00:00.000000000', '1980-01-10T12:00:00.000000000',
'1980-01-10T18:00:00.000000000', '1980-01-11T00:00:00.000000000',
'1980-01-11T06:00:00.000000000', '1980-01-11T12:00:00.000000000',
'1980-01-11T18:00:00.000000000', '1980-01-12T00:00:00.000000000',
'1980-01-12T06:00:00.000000000', '1980-01-12T12:00:00.000000000',
'1980-01-12T18:00:00.000000000', '1980-01-13T00:00:00.000000000',
'1980-01-13T06:00:00.000000000', '1980-01-13T12:00:00.000000000',
'1980-01-13T18:00:00.000000000', '1980-01-07T00:00:00.000000000',
'1980-01-07T06:00:00.000000000', '1980-01-07T12:00:00.000000000',
'1980-01-07T18:00:00.000000000', '1980-01-08T00:00:00.000000000',
'1980-01-08T06:00:00.000000000', '1980-01-08T12:00:00.000000000',
'1980-01-08T18:00:00.000000000', '1980-01-09T06:00:00.000000000',
...
'1980-01-21T06:00:00.000000000', '1980-01-21T12:00:00.000000000',
'1980-01-21T18:00:00.000000000', '1980-01-22T00:00:00.000000000',
'1980-01-22T06:00:00.000000000', '1980-01-22T12:00:00.000000000',
'1980-01-22T18:00:00.000000000', '1980-01-23T00:00:00.000000000',
'1980-01-23T06:00:00.000000000', '1980-01-23T12:00:00.000000000',
'1980-01-23T18:00:00.000000000', '1980-01-24T00:00:00.000000000',
'1980-01-24T06:00:00.000000000', '1980-01-24T12:00:00.000000000',
'1980-01-24T18:00:00.000000000', '1980-01-25T00:00:00.000000000',
'1980-01-25T06:00:00.000000000', '1980-01-25T12:00:00.000000000',
'1980-01-25T18:00:00.000000000', '1980-01-26T00:00:00.000000000',
'1980-01-26T06:00:00.000000000', '1980-01-26T12:00:00.000000000',
'1980-01-26T18:00:00.000000000', '1980-01-27T00:00:00.000000000',
'1980-01-27T06:00:00.000000000', '1980-01-27T12:00:00.000000000',
'1980-01-27T18:00:00.000000000', '1980-01-28T00:00:00.000000000',
'1980-01-28T06:00:00.000000000', '1980-01-28T12:00:00.000000000',
'1980-01-28T18:00:00.000000000', '1980-01-29T00:00:00.000000000',
'1980-01-29T06:00:00.000000000', '1980-01-29T12:00:00.000000000',
'1980-01-29T18:00:00.000000000', '1980-01-30T00:00:00.000000000',
'1980-01-30T06:00:00.000000000', '1980-01-30T12:00:00.000000000',
'1980-01-30T18:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
* record (record) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[2], line 5
3 # Try to select a subset of times based on datetime
4 print(tracks.time)
----> 5 tracks_subset = tracks.where(tracks.time > datetime.datetime(1980, 1, 10), drop=True)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/_typed_ops.py:287, in DataArrayOpsMixin.__gt__(self, other)
286 def __gt__(self, other: DaCompatible) -> Self:
--> 287 return self._binary_op(other, operator.gt)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/dataarray.py:4691, in DataArray._binary_op(self, other, f, reflexive)
4687 other_variable_or_arraylike: DaCompatible = getattr(other, "variable", other)
4688 other_coords = getattr(other, "coords", None)
4690 variable = (
-> 4691 f(self.variable, other_variable_or_arraylike)
4692 if not reflexive
4693 else f(other_variable_or_arraylike, self.variable)
4694 )
4695 coords, indexes = self.coords._merge_raw(other_coords, reflexive)
4696 name = self._result_name(other)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/_typed_ops.py:619, in VariableOpsMixin.__gt__(self, other)
618 def __gt__(self, other: VarCompatible) -> Self | T_DataArray:
--> 619 return self._binary_op(other, operator.gt)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/variable.py:2411, in Variable._binary_op(self, other, f, reflexive)
2408 attrs = self._attrs if keep_attrs else None
2409 with np.errstate(all="ignore"):
2410 new_data = (
-> 2411 f(self_data, other_data) if not reflexive else f(other_data, self_data)
2412 )
2413 result = Variable(dims, new_data, attrs=attrs)
2414 return result
TypeError: '>' not supported between instances of 'int' and 'datetime.datetime'
However, the same comparison works with np.datetime64; the syntax is just slightly different:
[3]:
import numpy as np
tracks_subset = tracks.where(tracks.time > np.datetime64("1980-01-10"), drop=True)
print(tracks_subset.time)
<xarray.DataArray 'time' (record: 72)>
array(['1980-01-10T06:00:00.000000000', '1980-01-10T12:00:00.000000000',
'1980-01-10T18:00:00.000000000', '1980-01-11T00:00:00.000000000',
'1980-01-11T06:00:00.000000000', '1980-01-11T12:00:00.000000000',
'1980-01-11T18:00:00.000000000', '1980-01-12T00:00:00.000000000',
'1980-01-12T06:00:00.000000000', '1980-01-12T12:00:00.000000000',
'1980-01-12T18:00:00.000000000', '1980-01-13T00:00:00.000000000',
'1980-01-13T06:00:00.000000000', '1980-01-13T12:00:00.000000000',
'1980-01-13T18:00:00.000000000', '1980-01-10T06:00:00.000000000',
'1980-01-10T12:00:00.000000000', '1980-01-10T18:00:00.000000000',
'1980-01-11T00:00:00.000000000', '1980-01-11T06:00:00.000000000',
'1980-01-11T12:00:00.000000000', '1980-01-11T18:00:00.000000000',
'1980-01-12T00:00:00.000000000', '1980-01-12T06:00:00.000000000',
'1980-01-17T06:00:00.000000000', '1980-01-17T18:00:00.000000000',
'1980-01-18T06:00:00.000000000', '1980-01-19T00:00:00.000000000',
'1980-01-19T06:00:00.000000000', '1980-01-20T06:00:00.000000000',
'1980-01-20T12:00:00.000000000', '1980-01-20T18:00:00.000000000',
'1980-01-21T00:00:00.000000000', '1980-01-21T06:00:00.000000000',
'1980-01-21T12:00:00.000000000', '1980-01-21T18:00:00.000000000',
'1980-01-22T00:00:00.000000000', '1980-01-22T06:00:00.000000000',
'1980-01-22T12:00:00.000000000', '1980-01-22T18:00:00.000000000',
'1980-01-23T00:00:00.000000000', '1980-01-23T06:00:00.000000000',
'1980-01-23T12:00:00.000000000', '1980-01-23T18:00:00.000000000',
'1980-01-24T00:00:00.000000000', '1980-01-24T06:00:00.000000000',
'1980-01-24T12:00:00.000000000', '1980-01-24T18:00:00.000000000',
'1980-01-25T00:00:00.000000000', '1980-01-25T06:00:00.000000000',
'1980-01-25T12:00:00.000000000', '1980-01-25T18:00:00.000000000',
'1980-01-26T00:00:00.000000000', '1980-01-26T06:00:00.000000000',
'1980-01-26T12:00:00.000000000', '1980-01-26T18:00:00.000000000',
'1980-01-27T00:00:00.000000000', '1980-01-27T06:00:00.000000000',
'1980-01-27T12:00:00.000000000', '1980-01-27T18:00:00.000000000',
'1980-01-28T00:00:00.000000000', '1980-01-28T06:00:00.000000000',
'1980-01-28T12:00:00.000000000', '1980-01-28T18:00:00.000000000',
'1980-01-29T00:00:00.000000000', '1980-01-29T06:00:00.000000000',
'1980-01-29T12:00:00.000000000', '1980-01-29T18:00:00.000000000',
'1980-01-30T00:00:00.000000000', '1980-01-30T06:00:00.000000000',
'1980-01-30T12:00:00.000000000', '1980-01-30T18:00:00.000000000'],
dtype='datetime64[ns]')
Coordinates:
* record (record) int64 16 17 18 19 20 21 22 23 ... 91 92 93 94 95 96 97 98
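If the reference time already exists as a datetime.datetime, it can be converted with np.datetime64 instead of being retyped as a string. The sketch below uses a synthetic time array (an assumption, not the example file) and selects a time window by combining two comparisons:

```python
import datetime

import numpy as np
import xarray as xr

# Synthetic stand-in for the time variable of a tracks dataset
time = np.array(
    ["1980-01-09T18:00", "1980-01-10T00:00", "1980-01-10T06:00", "1980-01-11T00:00"],
    dtype="datetime64[ns]",
)
tracks = xr.Dataset({"time": ("record", time)})

# Convert an existing datetime.datetime to datetime64 before comparing
start = np.datetime64(datetime.datetime(1980, 1, 10))
end = np.datetime64("1980-01-11")

# Keep records with start <= time < end
window = tracks.where((tracks.time >= start) & (tracks.time < end), drop=True)

print(window.time.values)
```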
Note that this isn't always the case. If the tracks are loaded with a different calendar (for example 360_day), the times are stored as cftime objects, which xarray does not convert to datetime64.
[7]:
# The tracks don't actually use a 360_day calendar.
# I'm just passing this as an argument to show an example of it loading this way
tracks = huracanpy.load(
huracanpy.example_TRACK_file, tracker="track", calendar="360_day"
)
print(tracks.time)
<xarray.DataArray 'time' (record: 46)>
array([cftime.datetime(2022, 1, 13, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 14, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 14, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 14, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 14, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 15, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 15, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 15, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 15, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 16, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 16, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 16, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 16, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 17, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 17, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 17, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 17, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 22, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 22, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 22, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
...
cftime.datetime(2022, 1, 24, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 25, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 25, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 25, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 25, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 26, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 26, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 26, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 26, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 27, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 27, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 27, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 27, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 28, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 28, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 28, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 28, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 29, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 29, 6, 0, 0, 0, calendar='360_day', has_year_zero=True)],
dtype=object)
Dimensions without coordinates: record
In this case, neither the datetime nor the datetime64 comparison works. You have to compare against a cftime.datetime object with the same calendar. The first two cells below fail; the third, which uses cftime.datetime, succeeds.
[9]:
tracks_subset = tracks.where(tracks.time > datetime.datetime(1980, 1, 10), drop=True)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 tracks_subset = tracks.where(tracks.time > datetime.datetime(1980, 1, 10), drop=True)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/_typed_ops.py:287, in DataArrayOpsMixin.__gt__(self, other)
286 def __gt__(self, other: DaCompatible) -> Self:
--> 287 return self._binary_op(other, operator.gt)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/dataarray.py:4691, in DataArray._binary_op(self, other, f, reflexive)
4687 other_variable_or_arraylike: DaCompatible = getattr(other, "variable", other)
4688 other_coords = getattr(other, "coords", None)
4690 variable = (
-> 4691 f(self.variable, other_variable_or_arraylike)
4692 if not reflexive
4693 else f(other_variable_or_arraylike, self.variable)
4694 )
4695 coords, indexes = self.coords._merge_raw(other_coords, reflexive)
4696 name = self._result_name(other)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/_typed_ops.py:619, in VariableOpsMixin.__gt__(self, other)
618 def __gt__(self, other: VarCompatible) -> Self | T_DataArray:
--> 619 return self._binary_op(other, operator.gt)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/variable.py:2411, in Variable._binary_op(self, other, f, reflexive)
2408 attrs = self._attrs if keep_attrs else None
2409 with np.errstate(all="ignore"):
2410 new_data = (
-> 2411 f(self_data, other_data) if not reflexive else f(other_data, self_data)
2412 )
2413 result = Variable(dims, new_data, attrs=attrs)
2414 return result
File src/cftime/_cftime.pyx:1432, in cftime._cftime.datetime.__richcmp__()
TypeError: cannot compare cftime.datetime(2022, 1, 13, 18, 0, 0, 0, calendar='360_day', has_year_zero=True) and datetime.datetime(1980, 1, 10, 0, 0) (different calendars)
[8]:
tracks_subset = tracks.where(tracks.time > np.datetime64("1980-01-10"), drop=True)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[8], line 1
----> 1 tracks_subset = tracks.where(tracks.time > np.datetime64("1980-01-10"), drop=True)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/_typed_ops.py:287, in DataArrayOpsMixin.__gt__(self, other)
286 def __gt__(self, other: DaCompatible) -> Self:
--> 287 return self._binary_op(other, operator.gt)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/dataarray.py:4691, in DataArray._binary_op(self, other, f, reflexive)
4687 other_variable_or_arraylike: DaCompatible = getattr(other, "variable", other)
4688 other_coords = getattr(other, "coords", None)
4690 variable = (
-> 4691 f(self.variable, other_variable_or_arraylike)
4692 if not reflexive
4693 else f(other_variable_or_arraylike, self.variable)
4694 )
4695 coords, indexes = self.coords._merge_raw(other_coords, reflexive)
4696 name = self._result_name(other)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/_typed_ops.py:619, in VariableOpsMixin.__gt__(self, other)
618 def __gt__(self, other: VarCompatible) -> Self | T_DataArray:
--> 619 return self._binary_op(other, operator.gt)
File ~/miniforge3/envs/core/lib/python3.11/site-packages/xarray/core/variable.py:2411, in Variable._binary_op(self, other, f, reflexive)
2408 attrs = self._attrs if keep_attrs else None
2409 with np.errstate(all="ignore"):
2410 new_data = (
-> 2411 f(self_data, other_data) if not reflexive else f(other_data, self_data)
2412 )
2413 result = Variable(dims, new_data, attrs=attrs)
2414 return result
TypeError: '>' not supported between instances of 'cftime._cftime.datetime' and 'datetime.date'
[12]:
import cftime
tracks_subset = tracks.where(
tracks.time > cftime.datetime(1980, 1, 10, calendar="360_day"), drop=True
)
print(tracks_subset.time)
<xarray.DataArray 'time' (record: 46)>
array([cftime.datetime(2022, 1, 13, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 14, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 14, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 14, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 14, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 15, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 15, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 15, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 15, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 16, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 16, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 16, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 16, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 17, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 17, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 17, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 17, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 22, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 22, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 22, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
...
cftime.datetime(2022, 1, 24, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 25, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 25, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 25, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 25, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 26, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 26, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 26, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 26, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 27, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 27, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 27, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 27, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 28, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 28, 6, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 28, 12, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 28, 18, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 29, 0, 0, 0, 0, calendar='360_day', has_year_zero=True),
cftime.datetime(2022, 1, 29, 6, 0, 0, 0, calendar='360_day', has_year_zero=True)],
dtype=object)
Dimensions without coordinates: record
Subsetting by track#
To apply a criterion to each track in the dataset, use huracanpy.subset.trackswhere:
[9]:
# Add storm category by pressure to each track and filter out those that don't reach
# category 2
tracks = huracanpy.load(huracanpy.example_csv_file)
tracks["category"] = huracanpy.utils.category.get_pressure_cat(
tracks.slp, slp_units="Pa"
)
# Show the categories for each storm
# Storms 0 and 2 reach category 2, and storm 1 only reaches category 1
for track_id, track in tracks.groupby("track_id"):
print("track", track_id, "category", int(track.category.max()))
# Subset the tracks by category threshold which will remove track 1
track_subset = huracanpy.subset.trackswhere(
tracks, lambda track: track.category.max() >= 2
)
# Confirm that track 1 has been filtered out
print("\n", "tracks remaining -", set(track_subset.track_id.data))
track 0 category 2
track 1 category 1
track 2 category 2
tracks remaining - {0, 2}
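To illustrate the idea of per-track filtering, here is a rough equivalent written with plain xarray. It is a sketch only and not necessarily how huracanpy.subset.trackswhere is implemented; the helper name `tracks_where_sketch` is hypothetical, the dataset is synthetic, and the dimension name `record` is assumed:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a tracks dataset with three tracks
tracks = xr.Dataset(
    {
        "track_id": ("record", np.array([0, 0, 1, 1, 2])),
        "category": ("record", np.array([1, 2, 0, 1, 2])),
    }
)


def tracks_where_sketch(ds, condition):
    """Keep every record of each track for which condition(track) is true.

    Hypothetical helper for illustration; assumes a "record" dimension
    and a "track_id" variable.
    """
    keep_ids = [
        track_id
        for track_id, track in ds.groupby("track_id")
        if bool(condition(track))
    ]
    mask = ds.track_id.isin(keep_ids)
    # Index by position so integer variables keep their dtype
    return ds.isel(record=np.flatnonzero(mask.values))


subset = tracks_where_sketch(tracks, lambda track: track.category.max() >= 2)

print(set(subset.track_id.values.tolist()))
```

The condition is evaluated once per track, and all records of a track are kept or dropped together, which is the difference from the record-by-record selection done with `where`.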