Client Story - CME DataMine
The CME Group Data Web Store
CME Group, the second-largest exchange group in the world, enlisted TickSmith to power CME DataMine, the online data web store that manages CME’s immense quantities of data for monetization. At the time, CME Group had no standard system for handling and processing data, so its data was dispersed across multiple departments. Normalizing, centralizing, and extracting the vast amount of historical data, and making it easily accessible online, seemed an impossible task.
CME Group is distinct among financial market companies because it is also a mercantile exchange. The requirements therefore included normalizing both traditional and alternative stock data. The final web store, CME DataMine, was to be a one-stop shop where investors could get instant access to both types of data.
Until 2012, CME shipped data to its clients on hard disk drives; clients would copy the data off the disk and return it within a month. With historical market data going back to the 1970s and beyond, CME needed a strong data management solution in order to move into the digital world.
As experts in implementing big data technology for capital markets, TickSmith was an easy choice. “The CME Group has always offered customers a robust set of historical data, and working with TickSmith allows us to deliver it in a way that is as easy as shopping online,” says Tim Wheeler, Product Manager at CME Group.
Flexible data delivery
- Choice of delivery formats to best suit the client’s needs
- Quick and easy access anytime and anywhere
Integrated invoicing and licensing
- Significantly reduces processing time, connects rapidly
Virtual SFTP Interface (Provider SFTP)
- A revolutionary way for internal/external data providers to send data into the system
- 750 terabytes ingested and growing
- 400 gigabytes (compressed) ingested daily
“We don’t need to touch the actual data all that much, we mostly touch everything around the data,” says Nicolas Doyen, Product Manager at TickSmith. “Clients can enhance their data sales processes and they can choose specific pieces to offer on the system. One of our big value props is that we enable clients to offer a full range of products, our technology scales infinitely, and we are constantly building new features to improve it. Clients are able to build a platform that fits their needs, exactly as they need it.”
TickSmith installed the GOLD Platform for selling data on the CME DataMine project. The platform’s technology is modular, which makes it simple for CME Group to scale and configure solutions for various use cases. The flexible configuration, along with the ability to add customized modules, would not only help CME Group organize its data for monetization but also enable it to develop and redefine product offerings. Because the platform needed to scale, building the infrastructure on premises was not feasible, which made a strong case for deploying it in the cloud.
CME Group is one of several capital markets firms to choose AWS as its cloud provider. AWS stands out for its flexibility, its cost-effectiveness, and its ability to meet high security standards and encryption protocols. Amazon has one of the largest ecosystems of security partners and solutions, which results in strong measures for privacy and data security. The numbers are convincing, with a cited fault rate of 0.00001%.
TickSmith is an Advanced AWS Partner in Financial Services Competency, with members on staff certified by AWS. As a result, the platform’s cloud infrastructure follows an accredited and secure workflow from ingestion to delivery, with entitlement and monitoring capabilities throughout the process. “The sophistication of their cloud offering, specifically regarding the range of services available and the flexibility of these services, makes AWS easily the best choice,” adds Doyen. “For actual clients, there’s proven success with Amazon regarding security. In terms of capabilities alone in enabling us to accomplish our goals, you couldn’t have a better cloud. They have everything we need.”
The CME DataMine web store is the official source of the most comprehensive price information available for CME Group markets, with more than 750 terabytes of historical market data. It provides a broad array of data types, including Market Depth, End-of-Day, and Block Trades, which can help customers discover insights and capture market opportunities. CME Group has also integrated third-party datasets that provide data around the instruments traded on its exchange, such as crop health analytics and oil tanker fill estimates. Along the way, a new data monetization use case emerged for the CME CF Bitcoin Reference Rate (BRR) and the CME CF Bitcoin Real-Time Index (BRTI), a standardized reference rate and a spot price index for online bitcoin trading.
One of the main challenges of this project was the general management of big data distribution, a huge pain point for any company looking to monetize massive amounts of data. It called for a revolutionary alternative to the traditional SFTP (Secure File Transfer Protocol) model, in which the data distributor has to set up client-specific SFTP servers and continuously copy data onto them. That method was slow and accrued additional storage costs. TickSmith addressed the issue by devising a new method, dubbed Virtual SFTP, that removes the need for both client-specific SFTP servers and data duplication.
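The idea of serving many clients from one copy of the data can be illustrated with a short Python sketch. This is not TickSmith's implementation, whose internals are not described here; the `VirtualSftp` class, the catalog layout, the storage keys, and the client names are all hypothetical. The sketch only shows the general principle: each client sees a virtual directory tree derived from its entitlements, and every virtual path resolves to the same shared object, so nothing is copied per client.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualSftp:
    """Hypothetical per-client view over a single shared store."""

    # dataset name -> object keys in one shared store (assumed layout)
    catalog: dict
    # client id -> set of dataset names that client may access
    entitlements: dict = field(default_factory=dict)

    def list_dir(self, client, path="/"):
        """List what a client would see at a virtual path."""
        allowed = self.entitlements.get(client, set())
        if path in ("", "/"):
            # The root shows only the datasets this client is entitled to.
            return sorted(name for name in allowed if name in self.catalog)
        dataset = path.strip("/")
        if dataset not in allowed:
            raise PermissionError(f"{client} is not entitled to {dataset}")
        return sorted(key.rsplit("/", 1)[-1] for key in self.catalog.get(dataset, []))

    def resolve(self, client, path):
        """Map a client's virtual file path to the shared storage key."""
        dataset, _, filename = path.strip("/").partition("/")
        if dataset not in self.entitlements.get(client, set()):
            raise PermissionError(f"{client} is not entitled to {dataset}")
        for key in self.catalog.get(dataset, []):
            if key.rsplit("/", 1)[-1] == filename:
                return key  # same object for every entitled client, no copy
        raise FileNotFoundError(path)
```

Under these assumptions, two clients entitled to the same dataset resolve their paths to the same stored object, and adding a client means adding an entitlement entry rather than provisioning a server and duplicating files.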
Another difficulty is delivering specific time ranges of data. The raw datasets, as supplied by third-party providers, include files with long date ranges and multiple parts. TickSmith is currently working on an ETL (extract, transform, load) process that will break up each historical data file into smaller files, even down to the hour, so customers can access only the data for the time ranges they need. This lowers the barrier to entry for end users and lets them access relevant data at a lower price point.
“It makes it that your big data problem becomes a very, very small data problem for the end consumer. For instance, some financial institutions don’t have big data technology expertise, so dealing with a 40 terabyte historical dataset is hard for them,” says Doyen. “So if they’re able to get smaller slices of data, they can more easily leverage them. They can actually load these files on their computers, which they couldn’t do before.”
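The kind of slicing described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ETL TickSmith is building: the function name `split_by_hour`, the CSV input, the `timestamp` column name, and the output file naming are all hypothetical. It groups time-stamped records into one small file per hour.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def split_by_hour(csv_text, ts_column="timestamp"):
    """Split CSV text into per-hour CSV texts, keyed by a file name.

    Assumes each row has an ISO 8601 timestamp in `ts_column`.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []

    # Bucket rows by the hour their timestamp falls in.
    buckets = defaultdict(list)
    for row in reader:
        ts = datetime.fromisoformat(row[ts_column])
        buckets[ts.strftime("%Y-%m-%dT%H")].append(row)

    # Write each bucket back out as its own small CSV, header included.
    parts = {}
    for hour, rows in sorted(buckets.items()):
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        parts[f"{hour}.csv"] = buf.getvalue()
    return parts
```

The sketch holds everything in memory for clarity. A dataset in the tens of terabytes would need a streaming or distributed version of the same grouping step, writing each hourly file as it goes.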
CME Group adopted big data technology early compared to other financial institutions and has since reaped major benefits. DataMine is essentially an online store or data catalog, powered by TickSmith’s GOLD Platform for data selling. Customers can subscribe to and access data instantaneously via web interfaces and APIs. “Since the launch of DataMine, our modern data web store, we have increased our active customer base by more than 50%,” adds Wheeler.