How to get started with TickVault’s Python API

May 16th, 2017 | Blog

By Bogdan Istrate

Big Data: everyone talks about it, but few know how to leverage it for a competitive advantage. At TickSmith, we live and breathe Big Data, and we know that easy access to the wealth of data created every day is every bit as important as the processing done on it. That’s why we’ve created a Python API to complement our TickVault data processing, analytics, and storage platform. This easy-to-use API gives developers fine-grained access to the historical marketplace data contained in TickVault’s Tick-by-Tick database, including top of book, derived content and market statistics. We’ve also made it incredibly easy to do analytics on this data using the pandas library, a high-performance Python package.

This article will show you everything you need to know to get started using the TickVault Python API to extract the data you’re looking for and run predictive analytics, backtesting, risk management or (almost) anything you can dream of. Portals with accessible data include NASDAQ-CX and Thomson Reuters Tick History. The source code is also available on our GitHub.

In this example, we’ll first look at the bid/ask spread for Toronto Dominion Bank as it was traded on CXC on March 2nd 2015. Then we’ll look at the intraday 2-minute Time-Weighted Average Spread (TWAS) Percent for 2 different equities with Thomson Reuters’ data.

Registration

The first step is to register an account on the demonstration site we’ll be using, which contains data from Nasdaq CX.

Navigate to nasdaq-cx.ticksmith.com, click the “Free Trial Now” button, and enter the requisite information.

Once you’ve got your free account set up, sign in and have a look around.

When you’re ready to continue this tutorial, you’ll need to generate an API key to unlock access to the Python API. Click on your name in the top right-hand corner of the page and click “Manage API Key”. On the page you land on, click “Generate New API Key”. You will use this key wherever you see <API_KEY> in the following examples.

Installation

The Python package is installable from the Python Package Index (PyPI). Python 3.x and matplotlib are required for this tutorial:

pip install tickvault-python-api

Following Along in a Jupyter Notebook

Depending on your level of comfort with Python, you can run the code examples below in your own development environment (installing dependencies as needed), or install Jupyter Notebook and run the examples there. For the latter path, download and install Anaconda, which provides most of the dependencies required below, and type into your terminal:

jupyter notebook

This will launch a browser window running Jupyter where you can click “Upload” to import our Jupyter notebook. Once you’ve replaced the placeholders with your own credentials, you can simply click the “play” button to run each code example.

Accessing TickVault | NASDAQ CX

Once the API is installed, you can access the data within TickVault using the email you signed up with and the API key you generated earlier.

from tickvaultpythonapi.nasdaqcxclient import NasdaqCxClient
 
nasdaq = NasdaqCxClient(user_name="<USER_NAME>", secret_key="<API_KEY>")
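If you would rather not paste your key into a notebook, one option is to read both values from environment variables. This is a minimal sketch, not part of the TickVault API: the helper and the variable names TICKVAULT_USER and TICKVAULT_API_KEY are hypothetical and chosen only for illustration.

```python
import os


def load_credentials(env=os.environ):
    """Return the keyword arguments a client constructor expects.

    TICKVAULT_USER and TICKVAULT_API_KEY are hypothetical variable names
    invented for this sketch; use whatever names suit your setup.
    """
    try:
        return {
            "user_name": env["TICKVAULT_USER"],
            "secret_key": env["TICKVAULT_API_KEY"],
        }
    except KeyError as missing:
        raise RuntimeError(
            f"Set the {missing.args[0]} environment variable first"
        ) from None


# Usage, once the package is installed and the variables are set:
# nasdaq = NasdaqCxClient(**load_credentials())
```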

To see what data you have access to, call the ‘datasets’ method:

nasdaq.datasets() 

Out: ['cx_eod_stats', 'cx_hits', 'cx_rollup_1000', 'cx_rollup_60000']

The ‘describe’ method can be called on any of the datasets above to see its available columns:

nasdaq.describe('cx_hits')

Out: {'ask_size': 'INT',
      'askprice': 'DECIMAL', 
      'bid_size': 'INT', 
      'bidprice': 'DECIMAL', 
      'buyerid': 'STRING', 
      'crosstype': 'STRING', 
      'day': 'STRING', 
      'execution_venue': 'STRING', 
      'halted': 'STRING', 
      'lastprice': 'DECIMAL', 
      'line_type': 'STRING', 
      'linenumber': 'BIGINT', 
      'listing_market': 'STRING', 
      'sellerid': 'STRING', 
      'short_exempt': 'STRING', 
      'source': 'STRING', 
      'ticker': 'STRING', 
      'trade_attribute': 'STRING', 
      'trade_initiator_side': 'STRING', 
      'traderef': 'STRING', 
      'ts': 'BIGINT', 
      'volume': 'BIGINT', 
      'yyyymmdd': 'STRING'}
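The column listing can also be used to build the comma-separated string that queries take in their fields argument. The sketch below assumes that describe returns a plain dictionary, as the output above suggests; it works on a hand-copied subset of that output so it runs without an account, and the helper name is our own.

```python
# A subset of the describe('cx_hits') output shown above, copied by hand.
columns = {
    "ask_size": "INT",
    "askprice": "DECIMAL",
    "bid_size": "INT",
    "bidprice": "DECIMAL",
    "lastprice": "DECIMAL",
    "ticker": "STRING",
    "ts": "BIGINT",
}


def fields_of_type(columns, wanted_type):
    """Return a sorted, comma-separated list of the columns with the given type."""
    return ",".join(
        sorted(name for name, kind in columns.items() if kind == wanted_type)
    )


price_fields = fields_of_type(columns, "DECIMAL")
print(price_fields)  # askprice,bidprice,lastprice
```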

To access the bid and ask prices for TD quotes when the ask and bid sizes were greater than 10, we will query the HiTS dataset:

result = nasdaq.query_hits(source="CHIX", tickers="td",
                           fields="ts,askprice,bidprice",
                           start_time=20150302093000, 
                           end_time=20150302160000,
                           predicates="ask_size > 10 and bid_size > 10 and line_type like Q",
                           limit=1000000)
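The start_time and end_time values above look like integers in YYYYMMDDHHMMSS form. Assuming that reading of the format is right, a small helper (hypothetical, not part of the API) can build them from datetime objects, which is handy when looping over many days. The time zone the API expects is not stated here, so this sketch simply formats whatever datetime it is given.

```python
from datetime import datetime


def to_tickvault_time(moment):
    """Format a datetime as a YYYYMMDDHHMMSS integer (format inferred from the query above)."""
    return int(moment.strftime("%Y%m%d%H%M%S"))


start_time = to_tickvault_time(datetime(2015, 3, 2, 9, 30))
end_time = to_tickvault_time(datetime(2015, 3, 2, 16, 0))
print(start_time, end_time)  # 20150302093000 20150302160000
```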

Let’s convert the result to a pandas DataFrame for analytics (pandas is automatically imported when the API is imported):

df = nasdaq.as_dataframe(result) 
df.info() 

Out: 
<class 'pandas.core.frame.DataFrame'> 
DatetimeIndex: 126990 entries, 2015-03-02 09:30:00.004000 to 2015-03-02 15:59:59.940000 
Data columns (total 2 columns): 
askprice 126990 non-null float64 
bidprice 126990 non-null float64 
dtypes: float64(2)
memory usage: 2.9 MB
None

To visualise the result:

df.plot()
[Figure: Intraday chart of TD bid and ask prices, 2015-03-02]
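With the bid and ask prices in a DataFrame, the spread itself is one subtraction away. Here is a minimal sketch using only standard pandas calls; the three quotes are made up and stand in for the query result, so you can run it without an account and then apply the same lines to df.

```python
import pandas as pd

# Synthetic quotes standing in for the query result; the prices are made up.
quotes = pd.DataFrame(
    {"askprice": [55.02, 55.03, 55.02], "bidprice": [55.00, 55.00, 55.01]},
    index=pd.to_datetime(
        [
            "2015-03-02 09:30:00.004",
            "2015-03-02 09:30:00.250",
            "2015-03-02 09:30:01.100",
        ]
    ),
)

# Quoted spread in dollars, and in basis points of the midpoint.
quotes["spread"] = quotes["askprice"] - quotes["bidprice"]
quotes["mid"] = (quotes["askprice"] + quotes["bidprice"]) / 2
quotes["spread_bps"] = 10_000 * quotes["spread"] / quotes["mid"]

# Simple (unweighted) average spread per minute.
per_minute = quotes["spread"].resample("1min").mean()
print(per_minute.round(4))
```

Calling quotes["spread"].plot() would then chart the spread rather than the two price series.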

Accessing TickVault | Thomson Reuters

The TickVault demo site (in partnership with Thomson Reuters) provides free access to 3 months of data from 8 exchanges worldwide. We’ll use this to compare the intraday 2-minute Time-Weighted Average Spread (TWAS) Percent for 2 different equities, one very volatile (VRX) and one less so (IBM), over the span of Jan – Feb 2016.

To access the dataset, follow the same registration and API key steps as above at https://trdata.tickvault.com, and query the Time and Sales dataset.

(Be aware that the fields can differ between datasets, so the describe method is useful when deciding which fields to query.)

from tickvaultpythonapi.trclient import TrClient

trdata = TrClient(user_name="<USER_NAME>", secret_key="<API_KEY>")
result = trdata.query_tas(source="NYS", tickers="IBM.N,VRX.N", 
                          fields="tstamp,ric,price,volume,askprice,bidprice", 
                          start_time=20160101000000, 
                          end_time=20160228000000, 
                          limit=100000000)

df = trdata.as_dataframe(result, index="tstamp")

Now that you have the result of the query in a data frame, you can run any analytics your little heart desires. Here, we’ll show you the result of our calculation of the 2-minute mean TWAS% (Time-Weighted Average Spread divided by the Volume-Weighted Average Price). The actual implementation is left as an exercise for the reader.
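For readers who want a starting point for that exercise, here is one possible sketch. It is not the calculation behind our graph. It assumes a single instrument, a DatetimeIndex, and that every row carries askprice, bidprice, price and volume. Each quote is weighted by the seconds until the next row, and that whole duration is credited to the bin where the quote starts, which is a simplification. The function name and the demo data are our own.

```python
import pandas as pd


def twas_percent(df, freq="2min"):
    """Per-bin time-weighted average spread as a percentage of VWAP.

    Simplification: a quote's full holding time is credited to the bin in
    which it starts, even if it runs past the end of that bin.
    """
    df = df.sort_index()
    # Seconds each row stayed in effect: the gap to the next row.
    # The last row has no successor, so it gets zero weight.
    held = df.index.to_series().diff().shift(-1).dt.total_seconds()
    held = held.fillna(0.0).to_numpy()
    spread = (df["askprice"] - df["bidprice"]).to_numpy()
    work = pd.DataFrame(
        {
            "spread_x_time": spread * held,
            "time": held,
            "price_x_volume": (df["price"] * df["volume"]).to_numpy(),
            "volume": df["volume"].to_numpy(),
        },
        index=df.index,
    )
    bins = work.resample(freq).sum()
    twas = bins["spread_x_time"] / bins["time"]
    vwap = bins["price_x_volume"] / bins["volume"]
    # Bins with no data (for example overnight) come out as NaN and are dropped.
    return (100 * twas / vwap).dropna()


# Made-up rows to exercise the function.
demo = pd.DataFrame(
    {
        "askprice": [10.10, 10.20, 10.10, 10.30, 10.30],
        "bidprice": [10.00, 10.00, 10.00, 10.10, 10.10],
        "price": [10.0, 10.0, 10.0, 10.0, 10.0],
        "volume": [100, 100, 100, 100, 100],
    },
    index=pd.to_datetime(
        [
            "2016-01-04 09:30:00",
            "2016-01-04 09:30:30",
            "2016-01-04 09:31:00",
            "2016-01-04 09:32:00",
            "2016-01-04 09:33:00",
        ]
    ),
)
demo_result = twas_percent(demo)
print(demo_result.round(2))  # 09:30 bin -> 1.25, 09:32 bin -> 2.0
```

To compare the two instruments, you could call the function once per ric value, for example {ric: twas_percent(group) for ric, group in df.groupby("ric")}, and then average each series by time of day to get the intraday mean.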

The resulting graph shows that the mean intraday spread is significantly higher for the more volatile instrument.

[Figure: Time-Weighted Average Spread Percent on NYSE]

It’s that (relatively) easy!

As you can see, the Python API can be used with a multitude of data science applications, as well as the aforementioned applications in backtesting algorithmic trading strategies, risk management, and satisfying compliance requirements, among other financial data analytics.

To learn more about our TickVault platform, click here (or give us a call).