Extended filesystem programming capabilities in Azure Data Lake Storage

Since the general availability of Azure Data Lake Storage Gen2 in February 2019, customers have been getting insights at cloud scale faster than ever before. Integration with analytics engines is critical for their analytics workloads, and the ability to programmatically ingest, manage, and analyze data is equally important. This ability underpins key areas of enterprise data lakes such as data ingestion, event-driven big data platforms, machine learning, and advanced analytics. Programmatic access is possible today using the Azure Data Lake Storage Gen2 REST APIs or the Blob REST APIs. In addition, customers can enable continuous integration and continuous delivery (CI/CD) pipelines using Blob PowerShell and CLI capabilities via multi-protocol access. As part of the journey to enable our developer ecosystem, our goal is to make customer application development easier than ever before.

We are excited to announce the public preview of the .NET SDK, Python SDK, Java SDK, PowerShell, and CLI for filesystem operations on Azure Data Lake Storage Gen2. Customers who are used to the familiar filesystem programming model can now implement it using the .NET, Python, and Java SDKs. Customers can also incorporate these filesystem operations into their CI/CD pipelines using PowerShell and CLI, enriching CI/CD pipeline automation for big data workloads on Azure Data Lake Storage Gen2. As part of this preview, the SDKs, PowerShell, and CLI support create, read, update, and delete (CRUD) operations on filesystems, directories, files, and permissions through filesystem semantics for Azure Data Lake Storage Gen2.
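To make the filesystem, directory, file, and permission operations a little more concrete, here is a minimal sketch of the request lines they correspond to on the Azure Data Lake Storage Gen2 REST endpoint. It only builds and prints the requests and sends nothing. It is not the SDK surface from this preview. The account name, filesystem name, paths, ACL string, and the helper functions `dfs_url` and `planned_requests` are all hypothetical, and a real request would also need an `Authorization` header and an `x-ms-version` header, which are left out here.

```python
from urllib.parse import quote, urlencode

# Hypothetical names for illustration only; substitute your own.
ACCOUNT = "myaccount"
FILESYSTEM = "raw-data"


def dfs_url(account, filesystem, path="", **query):
    """Build a URL against the storage account's dfs endpoint."""
    url = f"https://{account}.dfs.core.windows.net/{quote(filesystem)}"
    if path:
        # quote() keeps "/" by default, so nested paths stay hierarchical.
        url += "/" + quote(path.strip("/"))
    if query:
        url += "?" + urlencode(query)
    return url


def planned_requests(account, filesystem):
    """Return (method, url, headers) tuples; auth and version headers omitted."""
    directory = "landing/2019"           # hypothetical directory
    file_path = directory + "/events.csv"  # hypothetical file
    return [
        # Create the filesystem.
        ("PUT", dfs_url(account, filesystem, resource="filesystem"), {}),
        # Create a directory inside it.
        ("PUT", dfs_url(account, filesystem, directory, resource="directory"), {}),
        # Create an empty file in that directory.
        ("PUT", dfs_url(account, filesystem, file_path, resource="file"), {}),
        # Set POSIX-style permissions (an ACL) on the directory.
        ("PATCH", dfs_url(account, filesystem, directory, action="setAccessControl"),
         {"x-ms-acl": "user::rwx,group::r-x,other::---"}),
        # Delete the directory and everything under it.
        ("DELETE", dfs_url(account, filesystem, directory, recursive="true"), {}),
    ]


if __name__ == "__main__":
    for method, url, headers in planned_requests(ACCOUNT, FILESYSTEM):
        print(method, url, headers)
```

The SDKs, PowerShell, and CLI in this preview are meant to spare you from assembling requests like these by hand, so treat the sketch as orientation rather than as a recommended way to call the service.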

Detailed reference documentation for all of these filesystem semantics is provided in the links below. These links will also help you get started and provide feedback.

This public preview is available in all regions. Your participation and feedback are critical to helping us enrich your development experience. Join us on our journey.