Data loaders
A collection of data loaders and related utilities.
Yahoo
Loader for the dataset provided by Yahoo.
The data can be downloaded with the script main().
- Expected dataset structure (data_path points to the Yahoo folder):
Yahoo
├── quarterly
│   ├── AAPL.csv
│   ├── FB.csv
│   └── …
└── base
    ├── AAPL.json
    ├── FB.json
    └── …
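Given a one-file-per-ticker layout like the one above, the set of available tickers can be read straight from the file names. The following is a minimal sketch of that idea, not part of ml_investment: the helper name list_tickers is hypothetical, and the demo builds a throwaway folder instead of touching a real dataset.

```python
import tempfile
from pathlib import Path
from typing import List


def list_tickers(folder: str, suffix: str) -> List[str]:
    """Return sorted ticker names taken from the file stems in `folder`."""
    return sorted(p.stem for p in Path(folder).glob(f"*{suffix}"))


# Demo on a temporary folder that mimics the `quarterly` subfolder.
with tempfile.TemporaryDirectory() as tmp:
    quarterly = Path(tmp) / "quarterly"
    quarterly.mkdir()
    for name in ("FB.csv", "AAPL.csv"):
        (quarterly / name).touch()
    tickers = list_tickers(str(quarterly), ".csv")

print(tickers)
```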
- class ml_investment.data_loaders.yahoo.YahooBaseData(data_path: str)
Bases: object
Loader for base information about a company (such as sector and industry).
- Parameters
data_path – path to the yahoo dataset folder.
- class ml_investment.data_loaders.yahoo.YahooQuarterlyData(data_path: str, quarter_count: Optional[int] = None)
Bases: object
Loader for quarterly fundamental information about companies (such as debt and revenue).
- Parameters
data_path – path to the yahoo dataset folder.
quarter_count – maximum number of most recent quarters to return. The actual number may be smaller for companies with a short history.
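The quarter_count cap can be pictured as a simple "keep the newest N rows" rule. The sketch below is only an illustration of that semantics under stated assumptions: the function last_quarters, the row fields, and the newest-first ordering are hypothetical and are not taken from the library.

```python
from typing import List, Optional


def last_quarters(rows: List[dict], quarter_count: Optional[int] = None) -> List[dict]:
    """Keep at most `quarter_count` most recent rows; None keeps all rows."""
    # ISO date strings sort chronologically, so newest-first is a reverse sort.
    ordered = sorted(rows, key=lambda r: r["date"], reverse=True)
    if quarter_count is None:
        return ordered
    return ordered[:quarter_count]


# Made-up history with only three quarters.
history = [
    {"date": "2020-03-31", "revenue": 10},
    {"date": "2020-06-30", "revenue": 12},
    {"date": "2020-09-30", "revenue": 11},
]
recent = last_quarters(history, quarter_count=2)  # capped at two rows
short = last_quarters(history, quarter_count=8)   # history is shorter than the cap
```

Asking for eight quarters from a three-quarter history returns just three rows, which is the "may be smaller" case described for the parameter.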
SF1
Loaders for the dataset provided by https://www.quandl.com/databases/SF1/data. The data can be downloaded with the script main().
- Expected dataset structure:
SF1
├── core_fundamental
│   ├── AAPL.json
│   ├── FB.json
│   └── …
├── daily
│   ├── AAPL.json
│   ├── FB.json
│   └── …
└── tickers.zip
- class ml_investment.data_loaders.sf1.SF1BaseData(data_path: Optional[str] = None)
Bases: object
Loader for base information about a company (such as sector and industry).
- Parameters
data_path – path to the sf1 dataset folder. If None, sf1_data_path from ~/.ml_investment/config.json is used.
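The None fallback for data_path amounts to a lookup in a JSON config file. Here is a minimal sketch of such a lookup, assuming the config is a flat JSON object keyed by names like sf1_data_path; the helper resolve_data_path is hypothetical and does not claim to mirror the library's internals. The demo writes a temporary config file rather than reading ~/.ml_investment/config.json.

```python
import json
import os
import tempfile
from typing import Optional


def resolve_data_path(data_path: Optional[str], key: str, config_path: str) -> str:
    """Return `data_path` if given, otherwise the value stored under `key` in the config."""
    if data_path is not None:
        return data_path
    with open(os.path.expanduser(config_path)) as f:
        config = json.load(f)
    return config[key]


# Demo with a temporary config file; the paths are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    with open(config_path, "w") as f:
        json.dump({"sf1_data_path": "/data/sf1"}, f)
    explicit = resolve_data_path("/other/sf1", "sf1_data_path", config_path)
    fallback = resolve_data_path(None, "sf1_data_path", config_path)
```

An explicit argument always wins; the config value is consulted only when the argument is None.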
- class ml_investment.data_loaders.sf1.SF1DailyData(data_path: Optional[str] = None, days_count: Optional[int] = None)
Bases: object
Loader for daily information about a company (such as marketcap and pe).
- Parameters
data_path – path to the sf1 dataset folder. If None, sf1_data_path from ~/.ml_investment/config.json is used.
days_count – maximum number of most recent days to return. The actual number may be smaller for companies with a short history.
- class ml_investment.data_loaders.sf1.SF1QuarterlyData(data_path: Optional[str] = None, quarter_count: Optional[int] = None, dimension: Optional[str] = 'ARQ')
Bases: object
Loader for quarterly fundamental information about companies (such as debt and revenue).
- Parameters
data_path – path to the sf1 dataset folder. If None, sf1_data_path from ~/.ml_investment/config.json is used.
quarter_count – maximum number of most recent quarters to return. The actual number may be smaller for companies with a short history.
dimension – one of ['MRY', 'MRT', 'MRQ', 'ARY', 'ART', 'ARQ']. This parameter is defined by the SF1 dataset.
- class ml_investment.data_loaders.sf1.SF1SNP500Data(data_path: Optional[str] = None)
Bases: object
Loader for S&P500 historical constituents.
- Parameters
data_path – path to the sf1 dataset folder. If None, sf1_data_path from ~/.ml_investment/config.json is used.
- load(index: Optional[List[numpy.datetime64]] = None) -> pandas.core.frame.DataFrame
- Parameters
index – list of dates to load constituents for, e.g. [np.datetime64('2018-01-01'), np.datetime64('2018-05-10')]. If a requested date is not present, the nearest past date is used. If None, constituents are loaded for all dates on which they changed.
- Returns
constituents information
- Return type
pd.DataFrame
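The "nearest past date" rule for index can be illustrated with a binary search over the sorted list of change dates. This is a sketch of the lookup rule only, using the standard library; nearest_past is a hypothetical helper, the change dates are invented for the example, and what the loader does for a date earlier than every known change is not documented here (the sketch returns None).

```python
from bisect import bisect_right
from datetime import date
from typing import List, Optional


def nearest_past(change_dates: List[date], query: date) -> Optional[date]:
    """Return the latest change date that is <= `query`.

    `change_dates` must be sorted in ascending order.
    """
    pos = bisect_right(change_dates, query)
    return change_dates[pos - 1] if pos > 0 else None


# Invented change dates, for illustration only.
changes = [date(2018, 1, 2), date(2018, 3, 19), date(2018, 6, 5)]
exact = nearest_past(changes, date(2018, 3, 19))      # date present in the list
between = nearest_past(changes, date(2018, 5, 10))    # falls back to 2018-03-19
before_all = nearest_past(changes, date(2018, 1, 1))  # nothing earlier exists
```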
- ml_investment.data_loaders.sf1.translate_currency(df: pandas.core.frame.DataFrame, columns: Optional[List[str]] = None)
Translate the currency of the given columns to USD according to the exchange-rate information in the appropriate columns (like debtusd-debt).
- Parameters
df – quarterly-based data
columns – columns whose currency should be translated
- Returns
result with the same columns and shape, but with the currency converted in the given columns
- Return type
pd.DataFrame
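One way to read the debtusd-debt hint is that the ratio between a USD column and its local-currency counterpart yields an exchange rate, which can then be applied to other columns. The sketch below works on a single row stored as a plain dict and rests entirely on that assumption; to_usd is a hypothetical helper, the numbers are made up, and the library's actual derivation of the rate may differ.

```python
from typing import Dict, List


def to_usd(row: Dict[str, float], columns: List[str]) -> Dict[str, float]:
    """Convert `columns` to USD using the rate implied by debtusd / debt.

    Assumption: both `debt` and `debtusd` are present and `debt` is non-zero.
    """
    rate = row["debtusd"] / row["debt"]
    converted = dict(row)  # copy so the input row is left untouched
    for col in columns:
        converted[col] = row[col] * rate
    return converted


# Made-up row: 200 units of local-currency debt reported as 50 USD.
row = {"debt": 200.0, "debtusd": 50.0, "revenue": 1000.0}
usd_row = to_usd(row, ["revenue"])
```

The result keeps the same keys as the input, with only the listed columns rescaled, which matches the "same columns and shape" wording of the return description.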
Quandl Commodities
Loader for commodity price information from https://blog.quandl.com/api-for-commodity-data.
The data can be downloaded with the script main().
- Expected dataset structure:
commodities
├── LBMA_GOLD.json
├── CHRIS_CME_CL1.json
└── …
- class ml_investment.data_loaders.quandl_commodities.QuandlCommoditiesData(data_path: Optional[str] = None)
Bases: object
Loader for commodity price information.
- Parameters
data_path – path to the quandl_commodities dataset folder. If None, commodities_data_path from ~/.ml_investment/config.json is used.
Daily Price Bars
Loader for daily price bar information.
The data can be downloaded with the script main().
- Expected dataset structure:
daily_bars
├── AAPL.csv
├── TSLA.csv
└── …
- class ml_investment.data_loaders.daily_bars.DailyBarsData(data_path: Optional[str] = None, days_count: Optional[int] = None)
Bases: object
Loader for daily price bars.
- Parameters
data_path – path to the daily_bars dataset folder. If None, daily_bars_data_path from ~/.ml_investment/config.json is used.
days_count – maximum number of most recent days to return. The actual number may be smaller for companies with a short history.