
Archive for the Category DataWeek 2012


Eva Ho Speaking at DataWeek 2012

September 26th, 2012 by Vikas Gupta

Eva Ho, our VP of Marketing and Business Operations, will be speaking on a panel tomorrow at DataWeek titled “Challenges of Building on Geo Data”. She will be joined by Ben Standefer (Urban Airship), Ankit Agarwal (Micello), and Peter Davie (TomTom). It will be from 12:00 pm to 12:40 pm in the SPUR Urban Center, 4th floor, Room 2. The session description is:

Why do there need to be dozens of sources of unique types of geo data? If so much geo data is free and public, why are we witnessing the emergence of technologies that do nothing but make it easier for us to find and integrate geo data? This panel will seek to understand why IP data, place data, venue data, and check-in data all have unique geo data challenges that require a diverse array of geo data providers.

Aerospike Leads a Powerhouse Panel at DataWeek 2012

By Aerospike

Three Aerospike-led sessions at DataWeek shine spotlight on the role of NoSQL database and key-value store technology in addressing the demands of today’s real-time, Web-scale consumer-facing applications

Mountain View, CA – September 19, 2012 – Unprecedented volumes of structured and unstructured data are being generated at high velocity via new consumer channels and touch-points. Now innovative enterprises are capitalizing on this information to personalize their interactions with consumers in real time. At the DataWeek 2012 conference, Aerospike (formerly Citrusleaf) will lead a panel of IT executives from three of these enterprises, who will share their best practices for implementing real-time big data management and analytics to enable immediate Web-scale interactions with consumers. The panel is one of three sessions Aerospike will present at DataWeek, which runs September 24-27, 2012, at the SPUR Urban Center in San Francisco, CA.

Rules Vs. Reality: The Roles of Big Data, September 25

In a plenary session, Aerospike CEO Bruce Fram will discuss how no one database can address the new generation of consumer-facing applications, where the value lies in real-time responses based on the reality of what users are doing now. Instead, these applications are best served by a combined use of NoSQL, Hadoop, SQL, and data analytic platforms.

When: Tuesday, September 25, 1:00 p.m.
Where: SPUR Urban Center, 2nd Floor, Main Hall

Enterprise IT Panel: Real-Time Big Data Analytics, September 26

Russell Sullivan, Aerospike principal engineer, will be joined by Michael Yudin, adMarketplace CTO; Alex Hooshmand, BlueKai chief strategy officer and senior vice president of operations; and Andrei Dunca, LiveRail CTO. Together, they will discuss how personalized online consumer interactions require businesses to reliably receive a query, analyze data, and respond in 100 milliseconds or less—even as they process hundreds of thousands of transactions per second (TPS). They also will review how they have implemented the Aerospike real-time database and key-value store in combination with Hadoop, SQL, and analytic platforms to gain immediate, actionable insights that were unavailable using those three technologies alone.

When: Wednesday, September 26, 3:00 p.m.
Where: SPUR Urban Center, 4th Floor, Room 1

Integrating Aerospike and Real-Time Bidding, September 26

Brian Bulkowski, Aerospike founder and CTO, will lead a workshop demonstrating how the Aerospike real-time NoSQL database and key-value store can integrate easily with real-time bidding engines like OpenX to respond with actionable insights in milliseconds. The session also will provide an overview of how the Aerospike database is architected with flash-optimized hardware to deliver predictable performance of 250k TPS per node, sub-millisecond query responses, and 100% uptime—while managing terabytes of data and billions of objects.

When: Wednesday, September 26, 4:00 p.m.
Where: SPUR Urban Center, 4th Floor, Room 1
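The workshop's core claim — that a bid request reduces to a key-value read that must return within a strict latency budget — can be sketched in a few lines. This is an illustration only, not Aerospike's actual client API: a plain in-memory dict stands in for the key-value store, and the profile data, segment names, and budget figure are invented for the example.

```python
import time

# Hypothetical user profiles; in the architecture described above these
# would live in the key-value store, not in process memory.
profiles = {"user-123": {"segments": ["autos", "travel"]}}

LATENCY_BUDGET_MS = 10  # illustrative per-lookup budget


def handle_bid_request(user_id):
    """Look up the user's profile and decide whether to bid."""
    start = time.perf_counter()
    profile = profiles.get(user_id)  # the key-value read
    bid = profile is not None and "autos" in profile["segments"]
    elapsed_ms = (time.perf_counter() - start) * 1000
    if elapsed_ms > LATENCY_BUDGET_MS:
        return False  # too slow: skip this auction rather than stall it
    return bid


decision = handle_bid_request("user-123")
```

Dropping a slow lookup instead of waiting on it reflects how real-time bidding engines treat the latency budget as a hard deadline: a late response is worth nothing in an auction that has already closed.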

About DataWeek

DataWeek, the latest event from the organizers of TechWeek and Data 2.0 Summit, will put a spotlight on data innovation across multiple disciplines, including advertising, social media, health care, the OpenGov movement, and personal data management. The four-day conference and labs will feature more than 200 speakers, and over 1,000 participating companies. The event is being hosted September 24-27, 2012 at the SPUR Center in downtown San Francisco. For more information, visit http://www.dataweek.co.

About Aerospike

Aerospike, Inc. offers the only real-time NoSQL database and key-value store that delivers predictable high performance for mission-critical, Web-scale applications. Aerospike’s flash-optimized, shared-nothing architecture scales linearly, consistently processing over 200k transactions per second per node with sub-millisecond latency. With automatic fail-over, replication, and cross data center synchronization, the Aerospike database reliably stores billions of objects and terabytes of data—while providing 100% uptime and a 10x improvement in TCO over other NoSQL databases. Customers accelerating their business with Aerospike include adMarketplace, eXelate, Sony’s So-net, and The Trade Desk.

AYATA CEO, Atanu Basu, to moderate DataWeek 2012 Expert Panel “Big Analytics in Oil and Gas”

by www.ayata.com

AYATA CEO, Atanu Basu, moderated the panel Big Data Analytics in Oil and Gas during the DataWeek 2012 Conference, September 22-27.

DataWeek is the industry’s first week-long conference and festival showcasing innovations behind the Data Revolution. DataWeek will feature over 200 speakers — industry thought leaders, innovators, and evangelists addressing the new data economy and its impact on business, technology, and society.

See more: http://www.ayata.com/company/news-events/events/74-press-2012-09-17

3 Things We Learned From Data Week 2012

by TaskUs on October 4, 2012 in Smart Reading

From September 24th to September 27th, San Francisco’s SPUR Urban Center pushed all of the art pieces on display to the corners as the Big Data industry filled its halls. Big Data has arrived, and TaskUs was there to participate in a few days of learning, networking and plotting. Many questions were asked. How can we better acquire _________ data set? What’s the secret to integrating Big Data into the stack? How can we better utilize Big Data to precisely target the right customers? As we evolve in our understanding of Big Data and how to use it, it’s imperative to understand that there is no singular solution to any of the previously mentioned questions. Big Data is the 800-pound gorilla in the room, and to tame it, you are going to need a diverse set of solutions and workforces working together toward one common goal.

1) The Smartest Companies Made Big Data A Focus Several Years Ago

Companies like Autodesk made Big Data a priority, and the results have been superior. Have you ever mentioned tracking your market in a quarterly sales meeting? The task always seemed like a pipe dream at best. Autodesk has decided to make this dream a reality: it is compiling a remarkable database that definitively identifies and traces the ancestral history of every potential corporate customer. Considering that Autodesk makes AutoCAD, pretty much every corporation has some use for its suite of products. How is it accomplishing this tremendous feat? A clever mix of internal, outsourced and crowdsourced workforces working side-by-side (metaphorically) to create a verified, comprehensive and valuable database that would make any corporate sales force drool.

2) Pandora has the data, knows what you like

As members of the media, data enthusiasts, engineers and Data Week attendees sat in on a fireside chat with Tom Conrad, the resounding reaction to “peeking behind the curtain” at Pandora was pure amazement. At Pandora, Conrad is the leading mind behind everything at the internet radio company with the exception of the advertising division. How does one accurately predict taste? Music taste is perhaps the most fickle personal preference to predict. As the audience peppered Conrad with questions covering a diverse range of topics, the central message was about focusing all of your efforts as a company on providing the best internet radio. I realize that message is commonly heard from successful startups and seems unremarkably simple, yet it is shockingly difficult to maintain as competing music platforms like Spotify, Turntable.fm and a rumored/fabled Apple service continue to emerge. However, Pandora pays them little attention. As Conrad explained, Pandora views FM and AM radio as its actual competition. Current initiatives at Pandora focus mainly on putting Pandora into every new automobile manufactured. By effectively analyzing data acquired from its 58.3 million active users and sticking to a simple mantra of providing the absolute best internet radio service, Pandora continues to grow and defy meritless claims of imminent extinction.

3) Bringing Data To Market, There Are Businesses For That

As I made my rounds from booth to booth in the exhibitors’ area, it became quite clear that Big Data has arrived. I realize Big Data has been around for a long time, but I am supporting my assertion with the sheer number of businesses that have sprouted into existence to assist in bringing data to market. What do I mean by “bringing data to market”? Two major markets developed first, and they led to the creation of a third. The first market is data aggregation: finding, acquiring and organizing data sets previously considered too large, too difficult to acquire and too complex to organize. The second market involves monetizing Big Data: businesses package data in order to sell it to companies as valuable information, provide it directly to consumers in a manner that’s easy to digest, or implement Big Data sets into the backend of applications used by consumers and/or businesses. The third market was born from the necessity to bridge the gap between the first two. For example, TechStars alum Precog specifically strives to bridge the gap between data scientists, data sources, business intelligence and engineers by providing a data warehousing and analysis platform. Similarly, outsourcing projects that involve bringing big data to market continue to rise. Whether it’s manually verifying data like business contact information or aggregating data sets unattainable by machines, human outsourcing teams continue to work hand-in-hand with technical solutions to turn large data sets into profit machines!

In summary, Data Week was fantastic and gave me a strong sense of where the data industry is going. I am excited for the wonderful things to come and confident that companies like Pandora, Autodesk and Precog will continue to pave the way for their respective industries.

DataWeek 2012 Conference: Big Data Takes the Center Stage

by Saroj Kar on October 24, 2012

DataWeek 2012 Conference, a six-day event held in San Francisco, California, focused on big data with a good blend of panels, speakers, labs, and great new tools for managing data. The event was an opportunity to learn, and also to compare experiences and exchange ideas with various stakeholders.

All workshops offered by the various organizers centered on data visualization, social data, big data, or data mobility. Featured speakers included Jay Parikh, VP of Infrastructure at Facebook; Mike Olson, CEO of Cloudera; Ping Li, partner at Accel; and Rob High, CTO of IBM Watson. CloudTimes was a proud sponsor of this largest-ever big data show.

Innovative Big Data Tools

Vitria was honored for its KPI Application Builder, billed as the first of its kind, which delivers continuous views of an organization’s key performance indicators within minutes, so users can share those views and act on the information.

Operational Intelligence enables real-time perspectives in areas such as Big Data, events, and complex business processes. Operational Intelligence applications are lightweight, easily configurable web applications that provide insights based on continuous analysis of multiple data sources, both static and in motion.

Software AG announced that its wholly owned subsidiary Terracotta won the 2012 DataWeek Top Innovator Award in the “Big Data Technology” category for its flagship product, Terracotta BigMemory.

Terracotta BigMemory distinguished itself in the world of in-memory big data management solutions by driving innovation in data management. The platform can manage terabytes of data and thousands of users at once accessing structured and semi-structured data in real time.

Jay Parikh, VP of infrastructure engineering at Facebook, and Ping Li, a partner at venture capital firm Accel Partners, discussed the big data challenges and opportunities for startups and young companies. They shared that Hadoop, NoSQL databases, and other emerging open source big data platforms are quickly evolving, and more new applications are expected to reach consumers on top of these platforms.

Both leaders also said there is a need for more business intelligence, new analytics, and data visualization tools, because stats platforms like SAS and R for predictive analytics were not built for the big data world. Tableau Software has been wildly successful, but it was built before big data tools were even around.

Facebook is building real-time processing, graph analysis, data visualization and other analytics tools with open source projects like Memcached, MySQL and Hadoop. The company runs the largest Hadoop deployment in the world, with more than 100 petabytes of information.

Precog, the startup developer of infrastructure for data warehousing and analysis, offers a platform-as-a-service solution. The company’s big data solutions are designed for developers and data scientists, and combine the scalability of big data platforms with the number-crunching power of statistical tools. By simplifying big data capture, storage, and analysis through cloud APIs, developers can build highly sophisticated big data and analysis features into their applications.

Precog is designed for capturing event-oriented data, such as behavioral interaction records, transactional records, historical aggregates, individual sensor measurements, or any data set stored in a traditional relational database management system (RDBMS).

Big Data Analytics

Al Shipp, CEO of leading video intelligence company 3VR, shared how video intelligence tied to big data analytics could be a game changer for companies. He said industries from retail to banking to crime prevention can transform the consumer experience through video intelligence solutions. For example, a retail store system can now suggest products based on a buyer’s interests inferred from captured images, or hotel staff can greet guests by name or know their preferences even before they walk into the hotel.

“In the universe of big data, everyone is talking about structured data, and a few are talking about unstructured data but no one is talking about how unstructured data can be really leveraged. Video data has the lion’s share in big data, is unstructured data, and is in front of us. It is the elephant in the room that gets bigger every day,” said Shipp. “Companies can harness all of this unstructured data for making smarter business decisions.”

Zillow, the data and analytics specialist for home buyers, currently claims over 30 million unique users per month, who analyze data drawn from a database of over 100 million homes. The company offers a wide range of analyses and reports featuring foreclosure rates, neighborhood values, and the projected cost of renting vs. buying.

Insights from DataWeek: San Francisco

October 5th, 2012 Jim Walker

I spent some time at the first ever DataWeek in San Francisco last week. It is a brand new show and it was very well-run, spread across a few cool spaces with an interesting mix of novice to experienced data professionals. They had a good blend of labs, speakers, panels and great networking opportunities. In all, it was great and a big thanks and kudos to the organizers.

I took part in a panel and also presented a three-hour overview of Hadoop. There were some good questions thrown at the panel but more interesting was the discussion over the three sessions. Before each presentation, I ran an informal survey of the room to get a sense of audience and there was an even mix of complete novice, those new to Hadoop and experienced practitioners.

Each session had lively discussion and great engagement. There were three segments to the presentation: Hadoop market overview, Intro to Hadoop, and Hadoop usage patterns. I would also say that, in general, there were three key points that the audience really seemed to focus on.

Forest/Trees :: Distribution/Project
There are Hadoop distributions and there is the Apache Hadoop project. When you are new to this world and learning through all the media, you can get lost in this terminology, and the clarification of this point seemed important to some of the DataWeek crowd.

The conversation went a little like this… the Apache Hadoop project comprises MapReduce and HDFS. Sometimes we refer to this as “core Hadoop,” as it is the central focus of a Hadoop project. It provides redundant, reliable storage and distributed processing, or compute. In order for Hadoop, the project, to become a more complete data platform, we, the community, have created several related projects that make Hadoop more useful and dependable. When we package these projects (Hive, HBase, Pig, HCatalog, Ambari, ZooKeeper, Oozie, etc.) with core Hadoop, this becomes a “distribution”.

A distribution came about because each project has its own release cycle and getting the right versions together is sometimes difficult. Also, a distribution will package the projects and provide an installer to make deployment much easier.
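To make “core Hadoop” concrete, the classic word-count example shows the map-then-reduce shape that MapReduce executes across a cluster. The sketch below is plain Python rather than Hadoop’s Java API; it only stands in for what the framework does in a distributed, fault-tolerant way:

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word on every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def reduce_phase(pairs):
    """Reduce: sum the counts for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["hadoop stores data", "hadoop processes data"]
result = reduce_phase(map_phase(lines))  # {'hadoop': 2, 'data': 2, ...}
```

In real Hadoop, HDFS supplies the input splits, the framework shuffles map output to the reducers, and the other projects in a distribution layer higher-level tooling on top of this same model.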

Insatiable Thirst for Use Cases
Design Patterns by Gamma et al. is and always will be one of the best developer books ever written. I like design patterns because they take a lot of data and boil it down to naturally occurring states. They make sense of chaos.

In the third hour of our overview, we presented some reusable patterns of use for Hadoop, namely Refine, Explore, and Enrich. With refine, we apply a known process to a set of big data to extract results and use them in a business process. With explore, we use Hadoop to discover new information that was not attainable before. Often with explore, we will operationalize findings to be used in the refine pattern. Finally, with enrich, we use big data to supplement and improve the user experience for an online application.
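As a toy illustration of the refine pattern (the records and field names here are invented for the example, not from any real deployment), the “known process” is just a filter-and-aggregate step whose output feeds a business process:

```python
# Raw event records, standing in for a big data set in HDFS.
raw_events = [
    {"user": "a", "action": "purchase", "amount": 30.0},
    {"user": "b", "action": "view", "amount": 0.0},
    {"user": "a", "action": "purchase", "amount": 12.5},
]

def refine(events):
    """Known process: keep purchases only, total revenue per user."""
    revenue = {}
    for e in events:
        if e["action"] == "purchase":
            revenue[e["user"]] = revenue.get(e["user"], 0.0) + e["amount"]
    return revenue

report = refine(raw_events)  # {'a': 42.5} -- the refined result
```

Explore would instead run open-ended queries over the same raw events, and enrich would push a derived attribute from them back into the serving application.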

This session was scheduled for 45 minutes and went the full hour and beyond. There were a LOT of questions and interactions. The material was well received by the experienced professionals as it made sense of their projects and for those new to Hadoop it provided a good sense of where to start or how to approach this big data thing.

We Face Challenges
It seemed everyone wants to get started but is presented with challenges. There were really three areas of focus in this discussion: acquiring skills, managing a cluster, and building a business case. The business case and validation of a project was interesting, as some said you should just start with a project and run with it, while others advocated careful planning and a formal process. I guess in the end both sides were right.

It depends on your org and what it can stomach, really. I will add my two cents, however: Hadoop is open source and available to you today, so use it and start addressing all three of these challenges in the immediate future.

As noted, Dataweek was a huge success and I am honored to have taken part in what surely will be a regular event. Congrats to the organizers on the birth of a new show.