New tool will quickly analyze and visualize big data sets
Google released a new tool Tuesday that allows developers to easily and quickly analyze, visualize and publish data stored in its cloud.
Launched in Paris, Google Cloud Datalab is the latest in a string of products from cloud providers that seek to give users an interactive tool to explore data in what Google calls “a fast, simple and cost-effective way.” Users can pull data from BigQuery (Google Cloud’s data analytics service), Compute Engine and Cloud Storage, and quickly analyze it in languages such as Python, SQL or JavaScript.
The service builds upon Jupyter (formerly IPython), a popular Web application that allows users to create “notebooks,” documents that contain live code, equations, visualizations and text created with associated data. The visualizations are scalable, able to handle data sets from megabytes to gigabytes.
The lab also allows users to build and test data pipelines for deployment to BigQuery, and create and deploy machine learning models.
The cloud lab is currently free to test, as the project is still in beta. However, users must go through Google’s App Engine application and will have to pay for features after the beta ends, including use of BigQuery and any related storage. Google did not release pricing information.
Google is touting the service as open source and collaborative, with Git-based source control of notebooks, the option to sync with non-Google source code repositories, and the ability to fork and/or submit pull requests on the source code through GitHub.
This service comes as cloud providers are trying to make data models easier to create. Last week, Amazon Web Services announced Amazon QuickSight, a business intelligence tool that allows users to take data stored on AWS’s cloud and rapidly construct data visualizations. Other large companies like SAP, IBM, Microsoft and Salesforce have similar data analysis tools.
A number of these companies want to help the federal government harness the power of the terabytes of data it produces every day. Earlier this year, Amazon Web Services, Google, IBM, Microsoft and the Open Cloud Consortium entered into a cooperative research and development agreement (CRADA) with the Commerce Department that will push National Oceanic and Atmospheric Administration data into the companies’ respective cloud platforms, increasing both the amount of data that is publicly available and the speed at which it is released.
Ian Kalin, chief data officer for the Commerce Department, didn’t specifically say the agency will use this tool in that CRADA, but said he sees tools like Datalab as another way to “unlock the potential in government data.”
“One of the benefits of these new cloud tools is to not only perform big data analysis, but to also make such tools more accessible to more users,” Kalin told FedScoop. “Commerce has massive data systems – including those from the National Weather Service, the Patent and Trademark Office, and the Census Bureau – that can benefit from private-sector big data technologies. Because the data ecosystem is constantly evolving, emerging technologies, like these new cloud tools, can be applied to help expand the value and reach of official government data sets in order to build new products and services.”