Big Data Technology: In-House vs. Vendors

For retailers, the specter of big data looms constantly. Companies are throwing themselves into the omni-channel arms race as they try to fend off behemoths like Amazon. Some go so far as to pour massive resources into developing their own big data solutions in an attempt to go toe to toe with the retail giant. The natural question retailers face is what exactly they need to build in house and what they can, and probably should, outsource to vendors.

With the proliferation of the “software as a service” (SaaS) model, it is becoming simpler and faster to deploy new solutions in an enterprise setting. This naturally drives innovation in the industry, as old solutions can be replaced with newer, more effective ones in mere weeks. At the same time, large retailers have a natural desire to build a lot in house, in the same way Amazon invested heavily in internal technology to power everything from automated fulfillment to on-site user recommendations. However, it’s important to realize that not everything can or should be built in house. Retailers should think of their infrastructure as a data platform on top of which vendors innovate, in the same way the Apple and Android platforms allow individual developers to innovate with apps. We believe that algorithms in the cloud will become the most common SaaS applications in the next few years. Retailers who treat algorithms as a “core competency” and limit their development to internal teams will only stifle innovation and fall behind in the long term. Here, we outline the reasons why.


Great algorithmic solutions require immense talent. The war for talent these days is fierce, especially in data science. Data scientists are typically PhDs educated in computer science, statistics, or mathematics, and they command salaries of over $150,000. With a limited supply of qualified engineers and data scientists on the market, these candidates are more often lured by startups or by tech giants like Amazon, Google and Facebook. Unfortunately, most brick-and-mortar and online retailers are not the sexiest destinations for top-notch engineers. As a result, retailers have to compensate by paying more than double the already exorbitant salaries.

Simple math shows that a team of 20 data scientists and engineers can easily cost a retailer in excess of $4M a year. That is before factoring in the cost of recruiting, onboarding, and retaining this talent, and of acquiring and maintaining the infrastructure needed to develop the solution. By comparison, a typical SaaS solution will have a price tag of less than $1M a year (this is probably an absolute upper bound: typical fees will be lower than $500K). It’s not hard to see the massive savings retailers achieve by working with vendors.
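The comparison above can be sketched in a few lines of Python. This is only an illustration: the $200,000 average cost per head is an assumption chosen because it reproduces the $4M figure for a team of 20, the function names are hypothetical, and recruiting and infrastructure costs are left out just as they are in the estimate above.

```python
def annual_team_cost(headcount: int, avg_cost_per_head: float) -> float:
    """Yearly salary cost of an in-house team (salaries only)."""
    return headcount * avg_cost_per_head


def annual_savings(team_cost: float, saas_fee: float) -> float:
    """Yearly difference between building in house and paying a vendor."""
    return team_cost - saas_fee


# Assumed average of $200K per head; the post only states the $4M total.
team_cost = annual_team_cost(headcount=20, avg_cost_per_head=200_000)

# Upper-bound and typical SaaS fees quoted in the post.
savings_at_upper_bound = annual_savings(team_cost, saas_fee=1_000_000)
savings_at_typical_fee = annual_savings(team_cost, saas_fee=500_000)

print(f"In-house team:           ${team_cost:,.0f}")               # $4,000,000
print(f"Savings vs. $1M vendor:  ${savings_at_upper_bound:,.0f}")  # $3,000,000
print(f"Savings vs. $500K vendor: ${savings_at_typical_fee:,.0f}") # $3,500,000
```

Plugging in the doubled salaries mentioned earlier would only widen the gap.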

Speed to market and flexibility

For any technological venture, speed to market is key to overall success. This includes the development of internal technology. From project inception to launch, creating a big data solution can take as long as two to three years. That is two-plus years for a solution you need now. And while the need for an immediate solution is sizable, the lifecycle of technology is short. A two-year wait can create one of two problems: either your newly developed solution is nearly outdated at launch, or you get caught in an unending cycle of redesign in an attempt to keep up with a rapidly progressing technological landscape.

Meanwhile, with the wide adoption of the cloud-based SaaS model, integrating and deploying third-party solutions has never been faster. Jetlore, as an example, can be integrated and deployed in a mere twenty days. An immediate need is satisfied quickly, and with cutting-edge technology that is constantly being improved (algorithms are continuously optimized and tuned across the largest retailers in the world). More importantly, third-party vendors provide a level of flexibility not available with an internally built system. Removing and replacing a third-party SaaS solution is extremely simple, without the fear of exorbitant sunk costs and internal politics.


Technological and algorithmic advances occur extremely fast. Throughout history, it has been evident that competition plays a vital role in innovation. The SaaS model makes a solution easy to deploy and just as easy to replace. As a result, vendors are under constant pressure to innovate and improve. With an internal team, the choice has already been made, so there is no competition. Once a solution is built and deployed, the team's goal is to maintain and improve it. But you will never really know whether the solution your internal team has built is competitive on the market.

By working with third-party SaaS vendors, retailers are able to evaluate and deploy many cutting-edge solutions in a short timeframe with little investment on their side. These solutions are in use by many other retailers, and the vendors are under constant scrutiny from their customers to innovate and improve. Trying to build these things in house is not only cost prohibitive and too slow but, most importantly, limits innovation and will in the long run make your business less flexible and agile.

This doesn’t mean that retailers should fully outsource all their technology to vendors. When people talk about technology in the context of big data, they refer both to the infrastructure that stores and processes data and to the algorithms that interpret data and make predictions. Infrastructure includes storing omni-channel customer data, like purchases or claimed coupons, in a secure, privacy-preserving way and making this data accessible to supporting applications. Algorithms are effectively applications on top of the infrastructure that leverage the data for demand forecasting, churn prediction, dynamic pricing, or product personalization and targeting. Algorithms are built on top of the data infrastructure the same way apps are built on top of an operating system. Hence, it is imperative for retailers to invest internal resources and time in building secure, efficient, and scalable infrastructure. The right infrastructure, with external APIs and security (encryption of sensitive data), will enable your company to leverage cutting-edge technology from vendors and innovate continuously. This will allow your company to focus its attention and expertise on core business functions instead of attempting to become expert in unrelated fields. For any business, resources like money, time and brain space are finite. Winning businesses know how to win by pointing those resources in the right places.
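To make the platform-and-apps analogy concrete, here is a minimal Python sketch of a retailer-owned data platform with vendor algorithms plugged in on top. Everything in it is hypothetical and simplified for illustration: the `DataPlatform` class, the `Recommender` interface, and the two toy "vendors" are assumptions, not any real vendor's API, and a real platform would add the encryption and access controls mentioned above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class DataPlatform:
    """Retailer-owned infrastructure: stores purchases and exposes a read API."""
    _purchases: dict[str, list[str]] = field(default_factory=dict)

    def record_purchase(self, customer_id: str, product_id: str) -> None:
        self._purchases.setdefault(customer_id, []).append(product_id)

    def purchases(self, customer_id: str) -> list[str]:
        # Return a copy so vendor code cannot mutate the stored history.
        return list(self._purchases.get(customer_id, []))


class Recommender(Protocol):
    """Hypothetical interface every vendor algorithm must satisfy."""
    def recommend(self, platform: DataPlatform, customer_id: str, k: int) -> list[str]: ...


class MostRecentVendor:
    """Toy vendor: suggests the customer's most recent purchases first."""
    def recommend(self, platform: DataPlatform, customer_id: str, k: int) -> list[str]:
        return platform.purchases(customer_id)[::-1][:k]


class FirstPurchasesVendor:
    """Toy vendor: suggests the customer's earliest purchases first."""
    def recommend(self, platform: DataPlatform, customer_id: str, k: int) -> list[str]:
        return platform.purchases(customer_id)[:k]


platform = DataPlatform()
for product in ["shoes", "socks", "hat"]:
    platform.record_purchase("c1", product)

# Swapping vendors is a one-line change; the platform stays untouched.
vendor: Recommender = MostRecentVendor()
print(vendor.recommend(platform, "c1", k=2))  # ['hat', 'socks']

vendor = FirstPurchasesVendor()
print(vendor.recommend(platform, "c1", k=2))  # ['shoes', 'socks']
```

The point of the design is that the retailer owns the data and the interface, while the algorithm behind the interface stays replaceable.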

Photo: Connie Zhou