How eBay tackles data
There are 350 million items for sale on eBay on any given day, and the site reaches roughly 100 million customers. These products and customers produce 150 terabytes of internal data at eBay on a daily basis.
With one of the largest data environments in the world, the eBay data infrastructure needs to run quickly and must be easily accessible, according to Tom Fastner, a senior member of technical staff and an architect at eBay.
"Data is the DNA of eBay. All we really have at eBay is data," he said during his international keynote at ITWeb's Business Intelligence Summit this week. "We are not building data mausoleums at eBay. We are not just catching data and storing it away. Our data must be accessible."
To manage such high volumes of data, eBay needs three repositories, said Fastner. These include an enterprise data warehouse for the handling of structured and unstructured data; a Singularity system for A/B testing; and Hadoop, which is mostly used for language processing. For eBay, using the right analytics tools to do the right tasks is important, Fastner said, particularly given its capacity requirements. These systems come at a high cost, but Fastner noted that the value for the business increases as costs increase.
According to Fastner, the eBay Web site is run much like real estate, and it is the company's duty to drive value for the people trying to sell items online while keeping them safe. Fastner stressed that trust is key for a company that operates only online. "Over many years, we have tried to understand the risks associated with being online and have played with different programs to keep customers safe."
Speaking about eBay's development process, Fastner noted that all new developments at eBay are created primarily with mobile in mind. "Today, whatever we do, it's always mobile first," he said. "Every application, every feature is developed on mobile first and is then exported into the PC environment." This is to cater to the needs of consumers, who are becoming more and more mobile, he said.
"We are certainly operating at scale. But it is not about size. It is really about activating the data and bringing it to the users," he concluded.