How to get started with TickVault’s Python API
By Bogdan Istrate
Big Data: everyone talks about it, but few know how to leverage it for a competitive advantage. At TickSmith, we live and breathe Big Data, and we know that easy access to the wealth of data created every day is every bit as important as the processing done on it. That’s why we’ve created a Python API to complement our TickVault data processing, analytics, and storage platform. This easy-to-use API gives developers fine-grained access to the historical marketplace data contained in TickVault’s Tick-by-Tick database, including top of book, derived content, and market statistics. The API also works with pandas, a high-performance Python data analysis library, so running analytics on this data is straightforward.
This article will show you everything you need to know to get started using the TickVault Python API to extract the data you’re looking for and run predictive analytics, backtesting, risk management or (almost) anything you can dream of. Portals with accessible data include NASDAQ-CX and Thomson Reuters Tick History. The source code is also available on our GitHub.
In this example, we’ll first look at the bid/ask spread for Toronto-Dominion Bank (TD) as it was traded on CXC on March 2, 2015. Then we’ll look at the intraday 2-minute Time-Weighted Average Spread (TWAS) Percent for two different equities using Thomson Reuters data.
The first step is to register an account on the demonstration site we’ll be using, which contains data from Nasdaq CX.
Navigate to nasdaq-cx.ticksmith.com, click the “Free Trial Now” button, and enter the requisite information.
Once you’ve got your free account set up, sign in and have a look around.
When you’re ready to continue this tutorial, we’ll need to generate an API key to unlock access to the Python API. Click on your name in the top right-hand corner of the page and click “Manage API Key”. On the page you land on, click “Generate New API Key”. You will use this key wherever the following examples call for your API key.
Next, install the API package with pip:
pip install tickvault-python-api
Following Along in a Jupyter Notebook
Depending on your level of comfort with Python, you can run the code examples below in your own development environment (installing dependencies as needed), or install Jupyter Notebook and run the examples there. For the latter path, download and install Anaconda, which provides most of the dependencies required below, and run “jupyter notebook” in your terminal.
This will launch a browser window running Jupyter where you can click “Upload” to import our Jupyter notebook. Once you’ve replaced the placeholders with your own credentials, you can simply click the “play” button to run each code example.
Accessing TickVault | NASDAQ CX
Once the API is installed, you can access the data within TickVault using the email you signed up with and the API key you generated earlier.
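If you would rather not paste your email and API key directly into a notebook, one option is to read them from environment variables. The sketch below is only an illustration: the variable names TICKVAULT_EMAIL and TICKVAULT_API_KEY and the load_credentials helper are hypothetical and are not part of the TickVault API.

```python
import os


def load_credentials(env=os.environ):
    """Return (email, api_key) read from environment variables.

    TICKVAULT_EMAIL and TICKVAULT_API_KEY are hypothetical names chosen
    for this sketch; the TickVault API does not define them.
    """
    email = env.get("TICKVAULT_EMAIL")
    api_key = env.get("TICKVAULT_API_KEY")

    # Collect the names of any variables that are unset or empty.
    missing = [
        name
        for name, value in (("TICKVAULT_EMAIL", email), ("TICKVAULT_API_KEY", api_key))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return email, api_key
```

Calling load_credentials() with no arguments reads from the current process environment, and the returned pair can then be passed wherever the examples expect your email and key.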
To see what data you have access to, call the ‘datasets’ method:
The ‘describe’ method can be called on any of the datasets listed above to see its available columns:
To access the bid and ask prices for TD quotes when the ask and bid sizes were greater than 10, we will query the HiTS dataset:
Let’s convert the result to a pandas DataFrame for analytics (pandas is automatically imported when the API is imported):
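As a rough sketch of this kind of conversion, assume the query result is a list of dictionaries, one per quote. The records and field names below (time, bid_price, ask_price, bid_size, ask_size) are made up for illustration and may not match the actual HiTS columns. The size filter is repeated in pandas only to show the idea.

```python
import pandas as pd

# Made-up quote records standing in for a query result; the field names
# and values are assumptions, not real TickVault output.
records = [
    {"time": "2015-03-02 09:30:00", "bid_price": 54.10, "ask_price": 54.12,
     "bid_size": 12, "ask_size": 15},
    {"time": "2015-03-02 09:30:05", "bid_price": 54.11, "ask_price": 54.12,
     "bid_size": 8, "ask_size": 20},
    {"time": "2015-03-02 09:30:09", "bid_price": 54.11, "ask_price": 54.14,
     "bid_size": 30, "ask_size": 11},
]

# Build a DataFrame indexed by quote time, in chronological order.
quotes = pd.DataFrame.from_records(records)
quotes["time"] = pd.to_datetime(quotes["time"])
quotes = quotes.set_index("time").sort_index()

# Keep quotes where both sizes exceed 10, mirroring the filter described above.
large = quotes[(quotes["bid_size"] > 10) & (quotes["ask_size"] > 10)].copy()

# Bid/ask spread in price units.
large["spread"] = large["ask_price"] - large["bid_price"]
```

With the time column as the index, the spread series is ready for resampling or plotting with the usual pandas tools.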
To visualise the result:
Accessing TickVault | Thomson Reuters
The TickVault demo site (in partnership with Thomson Reuters) provides free access to 3 months of data from 8 exchanges worldwide. We’ll use this to compare the intraday 2-minute Time-Weighted Average Spread (TWAS) Percent for 2 different equities, one very volatile (VRX) and one less so (IBM), over the span of Jan – Feb 2016.
To access the dataset, follow the same registration and API key steps as above at https://trdata.tickvault.com, and query the Time and Sales dataset.
(Be aware that the fields can differ between datasets, so the ‘describe’ method is useful when deciding which fields to query.)
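A small helper can catch field mismatches before a query is sent. This is a sketch under assumptions: it supposes you have already collected a dataset's column names into a plain list, and both the missing_fields helper and the column names in the usage note are hypothetical.

```python
def missing_fields(requested, available):
    """Return the requested field names absent from the available columns.

    The order of the requested names is preserved in the result.
    """
    available_set = set(available)
    return [name for name in requested if name not in available_set]
```

For example, missing_fields(["price", "bid"], ["price", "size"]) returns ["bid"], which signals that “bid” would need to be dropped or renamed for that dataset.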
Now that you have the result of the query in a data frame, you can run any analytics your little heart desires. Here, we’ll show you our calculation of the 2-minute mean TWAS% (Time-Weighted Average Spread divided by the Volume-Weighted Average Price). The actual implementation is left as an exercise for the reader.
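One possible way to approach that exercise is sketched below. It is not TickSmith's implementation. It assumes quotes and trades are held in separate DataFrames indexed by timestamp, with hypothetical column names bid, ask, price and size. Time weighting is approximated by sampling the spread on a regular one-second grid and carrying each quote forward until the next one arrives, so that a plain mean per bin weights each quote by the time it was in force.

```python
import pandas as pd


def twas_percent(quotes, trades, bin_size="2min", grid="1s"):
    """Per-bin time-weighted average spread as a percentage of VWAP.

    quotes: DataFrame indexed by timestamp with 'bid' and 'ask' columns.
    trades: DataFrame indexed by timestamp with 'price' and 'size' columns.
    The column names are assumptions made for this sketch.
    """
    spread = (quotes["ask"] - quotes["bid"]).sort_index()

    # Sample the spread on a regular grid, carrying each quote forward until
    # the next one, so the mean per bin approximates a time-weighted average.
    on_grid = spread.resample(grid).last().ffill()
    twas = on_grid.resample(bin_size).mean()

    # Volume-weighted average price per bin: sum(price * size) / sum(size).
    notional = (trades["price"] * trades["size"]).resample(bin_size).sum()
    volume = trades["size"].resample(bin_size).sum()
    vwap = notional / volume

    # Bins with no trades have zero volume and come out as NaN.
    return 100.0 * twas / vwap
```

Applying the function to each instrument separately and averaging the resulting series by time of day gives one curve per symbol to compare. The grid only spans the first to the last quote, so the final quote is counted for a single grid step; a stricter version would extend it to the end of the session.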
The resulting graph shows that the mean intraday spread is significantly higher for the more volatile instrument.
It’s that (relatively) easy!
As you can see, the Python API lends itself to a wide range of data science work, including backtesting algorithmic trading strategies, risk management, and satisfying compliance requirements, among other financial data analytics.
To learn more about our platforms, simply give us a call.