Three Major Big Data Problems
Coping with Big Data: CSTA 2018 Recap
This past August, at the 2018 annual conference of the Canadian Security Traders Association (CSTA), TickSmith was invited to speak with industry titans on the importance of using big data technologies as financial firms accumulate more and more data.
The panel was moderated by Patrick McEntyre, Managing Director, Electronic Trading & Services at National Bank Financial.
The panelists were:
- Bruce Bland, Head of Algorithmic Research, Fidessa
- Scott Campbell, VP – Execution Services, Bank of America Merrill Lynch
- Brennan Carley, Global Head, Enterprise, Thomson Reuters Financial & Risk
- Francis Wenzel, CEO, TickSmith
There is a new gold rush to access data that would have been too complex to analyze before the promises of AI and machine learning. But how is this mountain of data made usable? How do you cope with both the quantity and diversity of data?
Scott starts off by explaining that, with each passing year, he has seen data become more expensive. Firms that accumulate useful data have the option to build a business around monetizing it, whether by selling the data outright or by using it to derive actionable insights in the markets.
Patrick asks the panelists what specific data problems they see in financial organizations.
Brennan sees growth in three use cases:
#1: Fitting data to a model & building backtesting capabilities
#2: Firms seeking Alpha (using additional datasets like news or satellite-based alternative data)
#3: Storing and providing data access for regulatory compliance
Scott explains that he has two major concerns:
#1: Ensuring data quality from legacy systems
#2: Turning data into actionable value. There is a large chasm between collecting data and figuring out how to use it.
Bruce explains that his main concern is building the storage component for real-time data feeds and making it queryable. Extracting whole data files is simple, but filtering and querying very specific parts of large data volumes requires well-defined business logic and robust big data technology. Value comes from summarizing large amounts of real-time data and distilling it into manageable, insightful files.
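To make the idea of filtering and summarizing concrete, here is a minimal Python sketch. It was not shown at the panel: the tick layout, the sample values, and the `one_minute_bars` helper are all hypothetical, and a production system would run this kind of logic on a big data platform rather than in a single process. It keeps only the trades for one symbol and condenses them into one-minute open/high/low/close/volume bars.

```python
from datetime import datetime

# Hypothetical trade ticks: (ISO-8601 UTC timestamp, symbol, price, size).
TICKS = [
    ("2018-08-01T13:30:05+00:00", "ABC", 10.00, 100),
    ("2018-08-01T13:30:40+00:00", "ABC", 10.05, 200),
    ("2018-08-01T13:30:50+00:00", "XYZ", 55.10, 50),
    ("2018-08-01T13:31:10+00:00", "ABC", 10.02, 300),
]


def one_minute_bars(ticks, symbol):
    """Filter ticks to one symbol and summarize them into one-minute bars."""
    bars = {}
    # Assumes every timestamp uses the same UTC offset, so sorting the
    # ISO strings also sorts the ticks chronologically.
    for ts, sym, price, size in sorted(ticks):
        if sym != symbol:
            continue  # filter: skip everything except the requested symbol
        minute = datetime.fromisoformat(ts).replace(second=0, microsecond=0)
        bar = bars.get(minute)
        if bar is None:
            # First trade of the minute sets every field of the bar.
            bars[minute] = {"open": price, "high": price, "low": price,
                            "close": price, "volume": size}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last trade seen so far in this minute
            bar["volume"] += size
    return bars


abc_bars = one_minute_bars(TICKS, "ABC")
for minute, bar in abc_bars.items():
    print(minute.isoformat(), bar)
```

The four raw ticks collapse into two summary rows for ABC, which is the "distilling" step in miniature: the output is far smaller than the input yet still answers the common questions about price range and volume.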
Patrick explains that a large issue he faces is bridging the gap between data scientists' technical skills and domain expertise. Scott responds that the ability of managers to meet a data scientist halfway is a much-needed skill.
Patrick asks the panelists about machine learning implementation.
Bruce comments that firms using machine learning to improve their business processes and generate revenue are on the horizon. He estimates this is three or four years away.
Francis explains that a prerequisite for training machine learning algorithms is a consolidated set of normalized and cleaned data. Firms must be able to master the ingestion, normalization, processing, and dissemination of large volumes of data for machine learning applications to be effective.
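As an illustration of what "normalized and cleaned" can mean in practice, the following Python sketch consolidates records from two vendors into one schema. Everything in it is an assumption made for the example: the vendor formats, the field names, and the cleaning rule are hypothetical and do not describe TickSmith's platform or any real feed.

```python
from datetime import datetime, timezone

# Hypothetical vendor A: epoch milliseconds, string prices, lowercase symbols.
FEED_A = [
    {"t": 1533130200000, "sym": "abc", "px": "10.05", "qty": "100"},
    {"t": 1533130260000, "sym": "abc", "px": "-1", "qty": "100"},  # bad price
]

# Hypothetical vendor B: ISO-8601 timestamps, prices in cents, padded tickers.
FEED_B = [
    {"time": "2018-08-01T13:29:59+00:00", "ticker": " ABC ",
     "price_cents": 1000, "shares": 200},
]


def normalize_vendor_a(rec):
    """Map a vendor A record onto the common schema."""
    return {
        "ts": datetime.fromtimestamp(rec["t"] / 1000, tz=timezone.utc),
        "symbol": rec["sym"].strip().upper(),
        "price": float(rec["px"]),
        "size": int(rec["qty"]),
    }


def normalize_vendor_b(rec):
    """Map a vendor B record onto the common schema."""
    return {
        "ts": datetime.fromisoformat(rec["time"]).astimezone(timezone.utc),
        "symbol": rec["ticker"].strip().upper(),
        "price": rec["price_cents"] / 100,
        "size": int(rec["shares"]),
    }


def is_clean(row):
    """Illustrative cleaning rule: non-empty symbol, positive price and size."""
    return bool(row["symbol"]) and row["price"] > 0 and row["size"] > 0


def consolidate(feed_a, feed_b):
    """Normalize both feeds, drop bad rows, and return one time-ordered set."""
    rows = [normalize_vendor_a(r) for r in feed_a]
    rows += [normalize_vendor_b(r) for r in feed_b]
    return sorted((r for r in rows if is_clean(r)), key=lambda r: r["ts"])


dataset = consolidate(FEED_A, FEED_B)
for row in dataset:
    print(row["ts"].isoformat(), row["symbol"], row["price"], row["size"])
```

Once every source lands in the same shape, with the same units and time zone, downstream processing and model training can treat the data as a single consolidated set.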
Patrick asks where the panelists see the biggest budgets for big data spending.
Scott explains that he sees an enormous budget for data security, one he views as effectively uncapped. If, for instance, client credit card information is leaked, the firm will get hit by regulators and incur enormous reputational damage.
Francis explains that where the bigger budget goes depends on management's strategic focus at that point in time. He has seen organizations cycle from focusing on compliance to later increasing the budget for traders seeking alpha.
The conversation shifts to data security & cloud implementation.
Francis explains that cloud solutions can be incredibly secure now. They can be even more secure than in-house infrastructure if the cloud tools are configured properly. There has been a huge shift from banks rejecting cloud solutions a few years ago to adopting them now. Francis has even seen banks beginning to implement multi-cloud solutions to prevent a single point of failure.
Francis explains further how cloud technology has evolved over time: from a focus on processing data with elastic MapReduce, to using containers to deploy releases more efficiently, to firms becoming cloud agnostic to prevent single points of failure.
Scott agrees and sees cloud solutions as more cost-effective and scalable. In-house racks never operate at full capacity and hardware depreciates, so paying only for the capacity you need makes more effective use of each dollar spent.
The panelists conclude with the point that future industry leaders will have to speak “data”. One analogy offered: just as a car owner in the 1960s had to be a bit of a “mechanic” to own a car, industry leaders will have to be experts in data while monetizing data transitions to the mainstream.
We had a great time at CSTA this year and we hope to see everyone again next year!