This reference architecture shows how to move data from mainframe and midrange systems to Azure. In this architecture, archived data is serviced and used only in the mainframe system. Azure is used only as a storage medium.
Architecture
Download a Visio file of this architecture.
To decide which method to use for moving data between the mainframe system and Azure storage, consider the frequency of data retrieval and the amount of data. Microsoft and third-party solutions are available:
- Microsoft solutions.
  - The Azure Data Factory FTP connector.
  - The Data Factory copy activity, which can copy data to any Azure storage solution.
  - Mainframe JCL to Azure Blob using Java, a custom solution for moving data from the mainframe system to Azure via Job Control Language (JCL). For more information, contact datasqlninja@microsoft.com.
- Third-party archive solutions. Solutions that you can easily integrate with mainframe systems, midrange systems, and Azure services. 
Workflow
- The Azure Data Factory FTP connector moves data from the mainframe system to Azure Blob Storage. This solution requires an intermediate virtual machine (VM) on which a self-hosted integration runtime is installed. 
- The Data Factory copy activity connects to the Db2 database to copy data into Azure storage. This solution also requires an intermediate VM on which a self-hosted integration runtime is installed. 
- The Microsoft Mainframe JCL to Azure Blob using Java custom solution moves data between the mainframe system and Blob Storage in either direction. This solution is based on Java and runs on Unix System Services on the mainframe. You can get this solution by contacting datasqlninja@microsoft.com.
  - You need to complete a one-time configuration of the solution. This configuration involves getting the Blob Storage access keys and moving required artifacts to the mainframe system.
  - A JCL submission moves files between the mainframe and Blob Storage.
  - Files are stored in binary format on Azure. You can configure the custom solution to convert EBCDIC to ASCII for simple data types.
- Optionally, Azure Data Box can help you physically transfer mainframe data to Azure. This option is appropriate when a large amount of data needs to be migrated and online transfer would take too long, for example, when the migration would take weeks.
- Third-party archive solutions provide easy interaction with the mainframe or midrange environment.
  - These solutions interact with the mainframe and handle various mainframe parameters, like data types, record types, storage types, and access methods. They serve as a bridge between Azure and the mainframe. Some third-party solutions connect a storage drive to the mainframe and help transfer data to Azure.
- Data is periodically synced and archived via the third-party archive solution. After the data is available via the third-party solution, the solution can easily push it to Azure by using available connectors. 
- Data is stored in Azure. 
- As needed, data is recalled from Azure back to the mainframe or midrange systems. 
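The choice between online transfer and Data Box in the workflow above depends on how long an online transfer would take. The following is a minimal sketch of that arithmetic, not part of the architecture itself. The function names, the 70 percent link utilization, and the 14-day threshold are illustrative assumptions; substitute values that match your own network and migration window.

```python
SECONDS_PER_DAY = 86_400


def estimated_transfer_days(data_tb: float, bandwidth_mbps: float,
                            utilization: float = 0.7) -> float:
    """Estimate the days needed to send data_tb terabytes over a link
    rated at bandwidth_mbps megabits per second.

    utilization is the assumed fraction of the link that the transfer
    can actually use (an illustrative default, not a measured value).
    """
    if data_tb < 0 or bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError(
            "data size must not be negative, bandwidth must be positive, "
            "and utilization must be in (0, 1]"
        )
    total_bits = data_tb * 1e12 * 8                      # decimal terabytes to bits
    effective_bps = bandwidth_mbps * 1e6 * utilization   # usable bits per second
    return total_bits / effective_bps / SECONDS_PER_DAY


def prefer_offline_transfer(data_tb: float, bandwidth_mbps: float,
                            max_days: float = 14.0,
                            utilization: float = 0.7) -> bool:
    """Return True when the online estimate exceeds max_days, which
    suggests evaluating a physical transfer option such as Data Box."""
    return estimated_transfer_days(data_tb, bandwidth_mbps, utilization) > max_days


if __name__ == "__main__":
    days = estimated_transfer_days(data_tb=100, bandwidth_mbps=1000)
    print(f"Estimated online transfer time: {days:.1f} days")
    print("Consider offline transfer:", prefer_offline_transfer(100, 1000))
```

With these assumed defaults, 100 TB over a 1-Gbps link comes to roughly 13 days, which sits just under the hypothetical 14-day threshold. The estimate ignores protocol overhead, the mainframe's own I/O limits, and competing traffic, so treat it as a first-pass screen only.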
Components
- Azure Data Factory is a cloud-based hybrid data integration service that you can use to create, schedule, and orchestrate your extract, transform, load (ETL) and extract, load, transform (ELT) workflows. In this architecture, Azure Data Factory orchestrates the movement of data from mainframe systems to Azure storage by using FTP connectors and copy activities. 
- Azure Files is a cloud storage service that provides simple and secure serverless cloud file shares. You can use these file shares for synchronization and data retention. In this architecture, Azure Files enables file-based data archiving and provides NFS/SMB access for mainframe systems to store and retrieve archived data. 
- Azure storage is a cloud platform that provides scalable, secure cloud storage for your data, apps, and workloads. In this architecture, Azure storage serves as the primary destination for archived mainframe data and provides cost-effective, long-term storage and lifecycle management capabilities. 
- Data Box is a physical device that you can use to move on-premises data to Azure. In this architecture, Data Box provides an option for physically transferring large volumes of mainframe data to Azure when online methods take too long. 
Alternatives
You can use the classic method of moving the data out of the mainframe or midrange system via FTP. Data Factory provides an FTP connector that you can use to archive the data on Azure.
Scenario details
Mainframe and midrange systems generate, process, and store huge amounts of data. When this data gets old, it's not typically useful. However, compliance and regulatory rules sometimes require this data to be stored for a certain number of years, so archiving it is critical. By archiving this data, you can reduce costs and optimize resources. Archiving data also helps with data analytics and provides a history of your data.
Potential use cases
Archiving data to the cloud can help you:
- Free up storage resources in mainframe and midrange systems.
- Optimize performance for queries by storing only relevant data on the active system.
- Reduce operational costs by storing data in a more economical way.
- Use archived data for analytics to create new opportunities and make better business decisions.
Recommendations
Depending on how you use the data, you might want to convert it from its binary (EBCDIC) format to ASCII before you upload it to Azure. Doing so makes analytics easier on Azure.
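As an illustration of this kind of conversion, the following minimal sketch decodes fixed-length EBCDIC text records by using Python's built-in `cp037` codec. It isn't the conversion logic of the Microsoft custom solution or of any third-party product. The function name, the record length, and the choice of code page 037 are assumptions; your data might use a different EBCDIC code page, and fields such as packed decimal or binary integers can't be decoded as text this way.

```python
from __future__ import annotations


def ebcdic_records_to_text(data: bytes, record_length: int,
                           codepage: str = "cp037") -> list[str]:
    """Split data into fixed-length records and decode each one from
    EBCDIC to a Python string, trimming trailing pad spaces.

    codepage defaults to cp037 (an assumption); pass another EBCDIC
    codec name, such as cp500, if your data uses a different code page.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length != 0:
        raise ValueError("data length is not a multiple of record_length")
    return [
        data[offset:offset + record_length].decode(codepage).rstrip()
        for offset in range(0, len(data), record_length)
    ]


if __name__ == "__main__":
    # Build two hypothetical 10-byte character-only records for the demo.
    sample = "CUST0001  ".encode("cp037") + "CUST0002  ".encode("cp037")
    for line in ebcdic_records_to_text(sample, record_length=10):
        # Each decoded string can be written out as ASCII when it holds
        # only ASCII-representable characters.
        print(line.encode("ascii").decode("ascii"))
```

This approach only suits records that contain character data throughout. Records that mix text with complex data types need field-level handling based on the record layout.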
Considerations
These considerations implement the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets that can be used to improve the quality of a workload. For more information, see Microsoft Azure Well-Architected Framework.
- Complex data types on the mainframe must be handled during archive.
- Application subject matter experts can identify which data needs to be archived.
- To determine the amount of time between syncs, consider factors like business criticality, compliance needs, and frequency of data access.
Cost Optimization
Cost Optimization is about looking at ways to reduce unnecessary expenses and improve operational efficiencies. For more information, see Design review checklist for Cost Optimization.
Use the Azure pricing calculator to estimate the cost of implementing this solution.
Third-party archive solutions
Some third-party solutions are available on Azure Marketplace. Each of these solutions requires unique configuration. Setting up these solutions is one of the primary tasks of implementing this architecture.
Azure storage
Azure has a variety of options for different application and technical requirements, like frequent versus infrequent access, and structured versus unstructured data. You can set up various storage lifecycle configurations in Azure storage. You can define the rules to manage the lifecycle. For an overview, see Configure a lifecycle management policy.
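To make the idea of lifecycle rules concrete, the following sketch builds a lifecycle management policy document that moves block blobs to the cool tier, then to the archive tier, and finally deletes them. The helper name, the rule name, the `mainframe-archive/db2/` prefix, and the day thresholds are hypothetical; choose values that match your own retention requirements, and check the generated JSON against the lifecycle management documentation before you apply it.

```python
import json


def build_archive_lifecycle_policy(prefix: str, cool_after_days: int,
                                   archive_after_days: int,
                                   delete_after_days: int,
                                   rule_name: str = "archiveMainframeData") -> dict:
    """Build a lifecycle management policy as a dictionary.

    The thresholds count days since the blob was last modified and must
    increase from cool to archive to delete.
    """
    if not 0 <= cool_after_days < archive_after_days < delete_after_days:
        raise ValueError("thresholds must satisfy 0 <= cool < archive < delete")
    return {
        "rules": [
            {
                "enabled": True,
                "name": rule_name,
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        # Prefix starts with the container name.
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_days
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after_days
                            },
                            "delete": {
                                "daysAfterModificationGreaterThan": delete_after_days
                            },
                        }
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical values: cool after 30 days, archive after 180 days,
    # delete after about seven years.
    policy = build_archive_lifecycle_policy("mainframe-archive/db2/", 30, 180, 2555)
    print(json.dumps(policy, indent=2))
```

You can save the printed JSON to a file and supply it when you configure the policy on the storage account. Keep in mind that the delete threshold should reflect the retention period that your compliance rules require.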
Data recall
Recall of archived data is an important aspect of archive solutions. A few of the third-party solutions provide a seamless experience for recalling archived data: you run a command on-premises, and the third-party agent automatically gets the data from Azure and ingests it back into the mainframe system.
Contributors
This article is maintained by Microsoft. It was originally written by the following contributors.
Principal author:
- Pratim Dasgupta | Engineering Architect
Other contributors:
- Ashish Khandelwal | Senior Engineering Architect Manager
- Ramanath Nayak | Engineering Architect
Next steps
For more information, contact Azure Data Engineering - Mainframe/Midrange Modernization.
See these resources:
- Azure Database Migration Guides
- What is Azure Data Factory?
- Introduction to Azure Storage
- What is Azure Files?
- What is Azure Data Box?
- Explore Azure Storage services