Pragmatic Works Nerd News

Expanding the Azure Data Box Family

Written by Chris Seferlis | Feb 21, 2019

In a previous blog I introduced Azure Data Box. Today I’d like to talk about how Microsoft is expanding the Azure Data Box family by introducing you to the Azure Data Box Gateway and the Azure Data Box Edge devices.

Until now the Data Box Family has been the disc, the box and the heavy. Each have their own limits for storage but are designed to improve your way of uploading massive amounts of data into Azure, without having to wait for it to travel across the wire or saturate your bandwidth (consider that the offline method).

Microsoft learned from customers that they want a better way to sync their local storage directly with Azure storage for operations like archival and disaster recovery. Here’s where Azure Data Box Gateway comes in.

The Data Box Gateway is a cloud storage gateway device that resides on premises and sends your image, media and other data directly to Azure.

  • The Gateway is a virtual machine provisioned in your Hypervisor (VMware or Hyper V) where you write the data directly to this virtual device using the NFS or SMB protocols, which it then sends to Azure.
  • One use case for the Data Box Gateway is for things like continuously ingesting massive amounts of data. So, we have a local data source that requires large data amounts and capacities and we can stream those and sync them directly with our Azure storage.
  • Another use case would be for a cloud archival of data in a secure and efficient way. If you then think about the incremental data transfer over the network after the initial bulk transfer is done using the Data Box of your choice for direct tie in to the same Azure storage container that you’re using for your Data Box.

Azure Data Box Edge is a storage solution that allows you to process data and send it over the network to Azure.

  • Data Box Edge uses a physical device supplied by Microsoft to accelerate the secure data transfer.
  • The device resides on premises in your network stack and you write data to it (also using NFS or SMB.)
  • It is additionally equipped with AI enabled Edge computing capabilities which help to analyze, process or filter data as it moves to Azure block blob, page blob or Azure files.
  • It has the appropriate chips to process intelligent learning (artificial intelligence, machine learning, deep learning and such).
  • Use cases are for things like pre-processing data. So, we can analyze data from on premises or IOT devices to get faster information about the data. That pre-processing will allow us to do things like aggregating your data before it gets sent to Azure or modifying data, such as taking out PII.
  • You can also subset and transfer the data needed for deeper analytics in the cloud.
  • Additionally, you can analyze and react to IOT events. So, if you’re running IOT devices on prem and you want the ability to be quicker to respond when those events occur, this is a great way to handle that.
  • Another great use case is you can run machine learning models to get quick results that can be acted on before the data is sent to the cloud.
  • With these IOT use cases, you don’t have to wait for the data to be transmitted over the wire, do any of the munging happening up in Azure and then return results. You can return those results on the fly in real time and react more quickly.
  • Eventually the full data set is transferred to continue and help you to retain and improve any of your machine learning models. You can continually feed it data and have those models trained repeatedly, thus learning to be more concise over time.

The Data Box family is a very cool technology by having an online version to further extend its capabilities. If you’re interested in the Data Box family or Azure or anything to do with data in the Microsoft Stack, we’re here to help. Click the link below or contact us – our experts can help you use your data to take your business from good to great.