Applies to: SQL Server, Azure SQL Database
Overview
This sample showcases the in-memory OLTP feature. It shows memory-optimized tables and natively compiled stored procedures, and can be used to demonstrate performance benefits of in-memory OLTP.
Note
To view this article for SQL Server 2014 (12.x), see Extensions to AdventureWorks to Demonstrate In-Memory OLTP.
The sample migrates five tables in the AdventureWorks2022 database to memory-optimized, and it includes a demo workload for sales order processing. You can use this demo workload to see the performance benefit of using in-memory OLTP on your server.
In the description of the sample, we discuss the tradeoffs that were made in migrating the tables to in-memory OLTP to account for the features that aren't (yet) supported for memory-optimized tables.
The documentation of this sample is structured as follows:
- Prerequisites for installing the sample and running the demo workload.
- Instructions for installing the in-memory OLTP sample based on AdventureWorks.
- Description of the sample tables and procedures - includes descriptions of the tables and procedures added to AdventureWorks2022 by the in-memory OLTP sample, as well as considerations for migrating some of the original AdventureWorks2022 tables to be memory-optimized.
- Instructions for performing performance measurements using the demo workload - includes instructions for installing and running ostress, a tool used for driving the workload, and for running the demo workload itself.
Prerequisites
- SQL Server 2016 (13.x)
- For performance testing, a server with specifications similar to your production environment. For this particular sample, you should have at least 16 GB of memory available to SQL Server. For general guidelines on hardware for in-memory OLTP, see the following blog post: Hardware considerations for In-Memory OLTP in SQL Server
Install the in-memory OLTP sample based on AdventureWorks
Follow these steps to install the sample:
Download AdventureWorks2016_EXT.bak and SQLServer2016Samples.zip from https://github.com/microsoft/sql-server-samples/releases/tag/adventureworks to a local folder, for example C:\Temp.

Restore the database backup using Transact-SQL or SQL Server Management Studio:

- Identify the target folder and filename for the data file, for example: H:\DATA\AdventureWorks2022_Data.mdf
- Identify the target folder and filename for the log file, for example: I:\DATA\AdventureWorks2022_log.ldf
- The log file should be placed on a different drive than the data file, ideally a low-latency drive such as an SSD or PCIe storage, for maximum performance.

Example T-SQL script:

RESTORE DATABASE [AdventureWorks2022]
FROM DISK = N'C:\temp\AdventureWorks2022.bak'
WITH FILE = 1,
     MOVE N'AdventureWorks2022_Data' TO N'h:\DATA\AdventureWorks2022_Data.mdf',
     MOVE N'AdventureWorks2022_Log' TO N'i:\DATA\AdventureWorks2022_log.ldf',
     MOVE N'AdventureWorks2022_mod' TO N'h:\data\AdventureWorks2022_mod'
GO

To view the sample scripts and workload, unpack the file SQLServer2016Samples.zip to a local folder. Consult the file In-Memory OLTP\readme.txt for instructions on running the workload.
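To confirm the restore succeeded and that the database contains a memory-optimized filegroup, you can run a quick check (a sketch; it filters on the filegroup type rather than a specific filegroup name, since the name may differ):

USE AdventureWorks2022;
GO
-- Memory-optimized data filegroups have type 'FX'.
SELECT name, type_desc
FROM sys.filegroups
WHERE type = N'FX';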
Description of the sample tables and procedures
The sample creates new tables for products and sales orders, based on existing tables in AdventureWorks2022. The schema of the new tables is similar to the existing tables, with a few differences, as explained later in this section.
The new memory-optimized tables carry the suffix _inmem. The sample also includes corresponding tables carrying the suffix _ondisk - these tables can be used to make a one-to-one comparison between the performance of memory-optimized tables and disk-based tables on your system.
The memory-optimized tables used in the workload for performance comparison are fully durable and fully logged. They don't sacrifice durability or reliability to attain the performance gain.
The target workload for this sample is sales order processing, where we also consider information about products and discounts. To this end, we use the tables SalesOrderHeader, SalesOrderDetail, Product, SpecialOffer, and SpecialOfferProduct.
Two new stored procedures, Sales.usp_InsertSalesOrder_inmem and Sales.usp_UpdateSalesOrderShipInfo_inmem, are used to insert sales orders and to update the shipping information of a given sales order.
The new schema Demo contains helper tables and stored procedures to execute a demo workload.
Concretely, the In-Memory OLTP sample adds the following objects to AdventureWorks2022:
Tables added by the sample
The new tables:

- Sales.SalesOrderHeader_inmem - Header information about sales orders. Each sales order has one row in this table.
- Sales.SalesOrderDetail_inmem - Details of sales orders. Each line item of a sales order has one row in this table.
- Sales.SpecialOffer_inmem - Information about special offers, including the discount percentage associated with each special offer.
- Sales.SpecialOfferProduct_inmem - Reference table between special offers and products. Each special offer can feature zero or more products, and each product can be featured in zero or more special offers.
- Production.Product_inmem - Information about products, including their list price.
- Demo.DemoSalesOrderDetailSeed - Used in the demo workload to construct sample sales orders.
Disk-based variations of the tables:

- Sales.SalesOrderHeader_ondisk
- Sales.SalesOrderDetail_ondisk
- Sales.SpecialOffer_ondisk
- Sales.SpecialOfferProduct_ondisk
- Production.Product_ondisk
Differences between original disk-based and the new memory-optimized tables
Usually, the new tables introduced by this sample use the same columns and the same data types as the original tables. However, there are a few differences. We list the differences in this section, along with a rationale for the changes.
Sales.SalesOrderHeader_inmem
- Default constraints are supported for memory-optimized tables, and most default constraints were migrated as is. However, the original table Sales.SalesOrderHeader contains two default constraints that retrieve the current date, for the columns OrderDate and ModifiedDate. In a high-throughput order processing workload with much concurrency, any global resource can become a point of contention. System time is such a global resource, and we have observed that it can become a bottleneck when running an In-Memory OLTP workload that inserts sales orders, in particular if the system time needs to be retrieved for multiple columns in the sales order header and the sales order details. The problem is addressed in this sample by retrieving the system time only once for each sales order that is inserted, and using that value for the datetime columns in SalesOrderHeader_inmem and SalesOrderDetail_inmem, in the stored procedure Sales.usp_InsertSalesOrder_inmem.
- Alias user-defined data types (UDTs) - The original table uses two alias UDTs, dbo.OrderNumber and dbo.AccountNumber, for the columns PurchaseOrderNumber and AccountNumber, respectively. SQL Server 2016 (13.x) doesn't support alias UDTs for memory-optimized tables, thus the new tables use the system data types nvarchar(25) and nvarchar(15), respectively.
- Nullable columns in index keys - In the original table, the column SalesPersonID is nullable, while in the new tables the column isn't nullable and has a default constraint with value (-1). This is because indexes on memory-optimized tables can't have nullable columns in the index key; -1 is a surrogate for NULL in this case.
- Computed columns - The computed columns SalesOrderNumber and TotalDue are omitted, as SQL Server 2016 (13.x) doesn't support computed columns in memory-optimized tables. The new view Sales.vSalesOrderHeader_extended_inmem reflects the columns SalesOrderNumber and TotalDue. Therefore, you can use this view if these columns are needed. Applies to: SQL Server 2017 (14.x). Beginning with SQL Server 2017 (14.x), computed columns are supported in memory-optimized tables and indexes.
- Foreign key constraints are supported for memory-optimized tables in SQL Server 2016 (13.x), but only if the referenced tables are also memory-optimized. Foreign keys that reference tables that are also migrated to memory-optimized are kept in the migrated tables, while other foreign keys are omitted. In addition, SalesOrderHeader_inmem is a hot table in the example workload, and foreign key constraints require extra processing for all DML operations, because they require lookups in the other tables referenced in these constraints. Therefore, the assumption is that the app ensures referential integrity for the Sales.SalesOrderHeader_inmem table, and referential integrity isn't validated when rows are inserted.
- Rowguid - The rowguid column is omitted. While uniqueidentifier is supported for memory-optimized tables, the option ROWGUIDCOL isn't supported in SQL Server 2016 (13.x). Columns of this kind are typically used for either merge replication or tables that have filestream columns. This sample includes neither.
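To illustrate a few of these choices, the following is a minimal, hypothetical sketch (not the sample's actual DDL; most columns are omitted) showing system data types in place of the alias UDTs, the -1 default standing in for NULL on an index key column, and no date/time default on OrderDate:

CREATE TABLE Sales.SalesOrderHeader_example_inmem
(
    SalesOrderID int IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
    OrderDate datetime2 NOT NULL,            -- no date/time default; set once per order by the insert procedure
    SalesPersonID int NOT NULL DEFAULT (-1), -- -1 as a surrogate for NULL, so the column can be an index key
    PurchaseOrderNumber nvarchar(25) NULL,   -- system type instead of the alias UDT dbo.OrderNumber
    AccountNumber nvarchar(15) NULL,         -- system type instead of the alias UDT dbo.AccountNumber
    INDEX ix_SalesPersonID HASH (SalesPersonID) WITH (BUCKET_COUNT = 1000000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);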
Sales.SalesOrderDetail
- Default constraints - Similar to SalesOrderHeader, the default constraint requiring the system date/time isn't migrated. Instead, the stored procedure inserting sales orders takes care of inserting the current system date/time on first insert.
- Computed columns - The computed column LineTotal wasn't migrated, as computed columns aren't supported with memory-optimized tables in SQL Server 2016 (13.x). To access this column, use the Sales.vSalesOrderDetail_extended_inmem view.
- Rowguid - The rowguid column is omitted. For details, see the description for the table SalesOrderHeader.
Production.Product
- Alias UDTs - The original table uses the user-defined data type dbo.Flag, which is equivalent to the system data type bit. The migrated table uses the bit data type instead.
- Rowguid - The rowguid column is omitted. For details, see the description for the table SalesOrderHeader.
Sales.SpecialOffer
- Rowguid - The rowguid column is omitted. For details, see the description for the table SalesOrderHeader.
Sales.SpecialOfferProduct
- Rowguid - The rowguid column is omitted. For details, see the description for the table SalesOrderHeader.
Considerations for indexes on memory-optimized tables
The baseline index for memory-optimized tables is the NONCLUSTERED index, which supports point lookups (index seek on equality predicates), range scans (index seek on inequality predicates), full index scans, and ordered scans. In addition, NONCLUSTERED indexes support searching on leading columns of the index key. In fact, memory-optimized NONCLUSTERED indexes support all the operations supported by disk-based NONCLUSTERED indexes, with the only exception being backward scans. Therefore, using NONCLUSTERED indexes is a safe choice for your indexes.
HASH indexes can be used to further optimize the workload. They're optimized for point lookups and row inserts. However, they don't support range scans, ordered scans, or searches on leading index key columns, so care needs to be taken when using these indexes. In addition, it's necessary to specify the bucket_count at create time. It should usually be set between one and two times the number of index key values, but overestimating is usually not a problem.
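As a sketch, the DDL for a memory-optimized table with HASH indexes and explicit bucket counts might look like the following (hypothetical table and bucket counts; not part of the sample):

CREATE TABLE dbo.ExampleOrders_inmem
(
    OrderID int NOT NULL,
    CustomerID int NOT NULL,

    -- Point lookups and inserts by OrderID; BUCKET_COUNT roughly 1-2x the expected number of rows.
    CONSTRAINT PK_ExampleOrders_inmem PRIMARY KEY NONCLUSTERED HASH (OrderID)
        WITH (BUCKET_COUNT = 16000000),

    -- Secondary hash index; oversizing the bucket count is usually harmless for point lookups.
    INDEX ix_ExampleOrders_CustomerID HASH (CustomerID)
        WITH (BUCKET_COUNT = 1000000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);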
For more information:
- Guidelines for online index operations
- Choosing the right bucket_count
- Indexes on Memory-Optimized Tables
The indexes on the migrated tables were tuned for the demo sales order processing workload. The workload relies on inserts and point lookups in the tables Sales.SalesOrderHeader_inmem and Sales.SalesOrderDetail_inmem, and it also relies on point lookups on the primary key columns in the tables Production.Product_inmem and Sales.SpecialOffer_inmem.
Sales.SalesOrderHeader_inmem has three indexes, which are all HASH indexes for performance reasons, and because no ordered or range scans are needed for the workload.
- HASH index on (SalesOrderID): bucket_count is sized at 10 million (rounded up to 16 million), because the expected number of sales orders is 10 million.
- HASH index on (SalesPersonID): bucket_count is 1 million. The data set provided doesn't have many sales persons, but this large bucket_count allows for future growth. Plus, you don't pay a performance penalty for point lookups if the bucket_count is oversized.
- HASH index on (CustomerID): bucket_count is 1 million. The data set provided doesn't have a lot of customers, but this allows for future growth.
Sales.SalesOrderDetail_inmem has three indexes, which are all HASH indexes for performance reasons, and because no ordered or range scans are needed for the workload.
- HASH index on (SalesOrderID, SalesOrderDetailID): this is the primary key index, and even though lookups on (SalesOrderID, SalesOrderDetailID) are infrequent, using a hash index for the key speeds up row inserts. The bucket_count is sized at 50 million (rounded up to 67 million): the expected number of sales orders is 10 million, and this is sized to have an average of five items per order.
- HASH index on (SalesOrderID): lookups by sales order are frequent: you want to find all the line items corresponding to a single order. bucket_count is sized at 10 million (rounded up to 16 million), because the expected number of sales orders is 10 million.
- HASH index on (ProductID): bucket_count is 1 million. The data set provided doesn't have a lot of products, but this allows for future growth.
Production.Product_inmem has three indexes:

- HASH index on (ProductID): lookups on ProductID are in the critical path for the demo workload, therefore this is a hash index.
- NONCLUSTERED index on (Name): this allows ordered scans of product names.
- NONCLUSTERED index on (ProductNumber): this allows ordered scans of product numbers.
Sales.SpecialOffer_inmem has one HASH index on (SpecialOfferID): point lookups of special offers are in the critical path of the demo workload. The bucket_count is sized at 1 million to allow for future growth.
Sales.SpecialOfferProduct_inmem isn't referenced in the demo workload, and thus there's no apparent need to use hash indexes on this table to optimize the workload - the indexes on (SpecialOfferID, ProductID) and (ProductID) are NONCLUSTERED.
In the previous example, some of the bucket counts are oversized, but not the bucket counts for the indexes on SalesOrderHeader_inmem and SalesOrderDetail_inmem: they're sized for just 10 million sales orders. This was done to allow installing the sample on systems with low memory availability, although in those cases the demo workload fails with an out-of-memory error. If you do want to scale well beyond 10 million sales orders, feel free to increase the bucket counts accordingly.
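For example, a bucket count can be increased in place by rebuilding the hash index (a sketch; the index name below is a placeholder, so look up the actual name in sys.indexes, and note the rebuild needs enough memory for both the old and the new index):

ALTER TABLE Sales.SalesOrderHeader_inmem
    ALTER INDEX PK_SalesOrderHeader_inmem      -- placeholder index name
    REBUILD WITH (BUCKET_COUNT = 100000000);   -- example value, sized for well beyond 10 million orders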
Considerations for memory utilization
Memory utilization in the sample database, both before and after running the demo workload, is discussed in the Section Memory utilization for the memory-optimized tables.
Stored procedures added by the sample
The two key stored procedures for inserting sales order and updating shipping details are as follows:
Sales.usp_InsertSalesOrder_inmem

Inserts a new sales order in the database and outputs the SalesOrderID for that sales order. As input parameters, it takes details for the sales order header and the line items in the order.

Output parameter:

- @SalesOrderID int - the SalesOrderID for the sales order that was just inserted
Input parameters (required):
- @DueDate datetime2
- @CustomerID int
- @BillToAddressID int
- @ShipToAddressID int
- @ShipMethodID int
- @SalesOrderDetails Sales.SalesOrderDetailType_inmem - table-valued parameter (TVP) that contains the line items of the order
Input parameters (optional):
- @Status tinyint
- @OnlineOrderFlag bit
- @PurchaseOrderNumber nvarchar(25)
- @AccountNumber nvarchar(15)
- @SalesPersonID int
- @TerritoryID int
- @CreditCardID int
- @CreditCardApprovalCode varchar(15)
- @CurrencyRateID int
- @Comment nvarchar(128)
Sales.usp_UpdateSalesOrderShipInfo_inmem

Updates the shipping information for a given sales order. This also updates the shipping information for all line items of the sales order.

This is a wrapper procedure for the natively compiled stored procedure Sales.usp_UpdateSalesOrderShipInfo_native, with retry logic to deal with (unexpected) potential conflicts with concurrent transactions updating the same order. For more information, see retry logic.

Sales.usp_UpdateSalesOrderShipInfo_native

This is the natively compiled stored procedure that actually processes the update to the shipping information. It's meant to be called from the wrapper stored procedure Sales.usp_UpdateSalesOrderShipInfo_inmem. If the client can deal with failures and implements retry logic, you can call this procedure directly, rather than using the wrapper stored procedure.
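As an illustration only (not the sample's actual code), such a retry wrapper typically looks something like the following sketch. The procedure name and parameter list are simplified assumptions; error numbers 41302, 41305, 41325, and 41301 are the In-Memory OLTP conflict and validation errors that are safe to retry.

CREATE PROCEDURE Sales.usp_ExampleUpdateShipInfoWithRetry  -- hypothetical wrapper name
    @SalesOrderID int,
    @ShipMethodID int
AS
BEGIN
    DECLARE @retry int = 10;

    WHILE (@retry > 0)
    BEGIN
        BEGIN TRY
            -- Parameter list simplified for illustration; the sample procedure takes more arguments.
            EXECUTE Sales.usp_UpdateSalesOrderShipInfo_native @SalesOrderID, @ShipMethodID;
            SET @retry = 0;  -- success: exit the loop
        END TRY
        BEGIN CATCH
            SET @retry -= 1;

            IF @retry > 0 AND ERROR_NUMBER() IN (41302, 41305, 41325, 41301)
            BEGIN
                -- Transient conflict with a concurrent transaction: brief pause, then retry.
                WAITFOR DELAY '00:00:00.001';
            END
            ELSE
            BEGIN
                -- Not retryable, or retries exhausted: rethrow to the caller.
                THROW;
            END
        END CATCH
    END
END;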
The following stored procedure is used for the demo workload.
- Demo.usp_DemoReset - Resets the demo by emptying and reseeding the SalesOrderHeader and SalesOrderDetail tables.
The following stored procedures are used for inserting into and deleting from memory-optimized tables, while guaranteeing domain and referential integrity:

- Production.usp_InsertProduct_inmem
- Production.usp_DeleteProduct_inmem
- Sales.usp_InsertSpecialOffer_inmem
- Sales.usp_DeleteSpecialOffer_inmem
- Sales.usp_InsertSpecialOfferProduct_inmem
Finally, the following stored procedure is used to verify domain and referential integrity:

dbo.usp_ValidateIntegrity

- Optional parameter: @object_id - ID of the object to validate integrity for

This procedure relies on the tables dbo.DomainIntegrity, dbo.ReferentialIntegrity, and dbo.UniqueIntegrity for the integrity rules that need to be verified - the sample populates these tables based on the check, foreign key, and unique constraints that exist for the original tables in the AdventureWorks2022 database. It relies on the helper procedures dbo.usp_GenerateCKCheck, dbo.usp_GenerateFKCheck, and dbo.GenerateUQCheck to generate the T-SQL needed for performing the integrity checks.
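For example, assuming the sample is installed, the validation procedure can be invoked for all registered rules or, using the optional parameter, for a single object (a sketch):

-- Validate all domain, referential, and uniqueness rules registered by the sample.
EXEC dbo.usp_ValidateIntegrity;

-- Validate only the rules for a single table.
EXEC dbo.usp_ValidateIntegrity @object_id = OBJECT_ID(N'Sales.SalesOrderHeader_inmem');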
Performance measurements using the demo workload
ostress is a command-line tool that was developed by the Microsoft CSS SQL Server support team. This tool can be used to execute queries or run stored procedures in parallel. You can configure the number of threads to run a given T-SQL statement in parallel, and you can specify how many times the statement should be executed on each thread; ostress spins up the threads and executes the statement on all threads in parallel. After execution finishes for all threads, ostress reports the time taken for all threads to finish execution.
Install ostress
ostress is installed as part of the Report Markup Language (RML) Utilities; there's no standalone installation for ostress.
Installation steps:
Download and run the x64 installation package for the RML utilities from the following page: Download RML for SQL Server
If there's a dialog box saying certain files are in use, select 'Continue'
Run ostress
Ostress is run from the command-line prompt. It's most convenient to run the tool from the RML Cmd Prompt, which is installed as part of the RML Utilities.
To open the RML Cmd Prompt, follow these instructions:
In Windows, open the start menu by selecting the Windows key, and type rml. Select RML Cmd Prompt, which is in the list of search results.
Ensure that the command prompt is located in the RML Utilities installation folder.
The command-line options for ostress can be seen when simply running ostress.exe without any command-line options. The main options to consider for running ostress with this sample are as follows:
| Option | Description |
|---|---|
| -S | Name of SQL Server instance to connect to. |
| -E | Use Windows authentication to connect (default); if you use SQL Server authentication, use the options -U and -P to specify the username and password, respectively. |
| -d | Name of the database, for this example AdventureWorks2022. |
| -Q | The T-SQL statement to be executed. |
| -n | Number of connections processing each input file/query. |
| -r | The number of iterations for each connection to execute each input file/query. |
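Putting these options together, an ad-hoc ostress invocation might look like the following sketch (local default instance, placeholder query, and small connection and iteration counts chosen only for illustration):

ostress.exe -S. -E -dAdventureWorks2022 -Q"SELECT COUNT(*) FROM Sales.SalesOrderHeader_inmem" -n10 -r5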
Demo workload
The main stored procedure used in the demo workload is Sales.usp_InsertSalesOrder_inmem/ondisk. The script in the following example constructs a table-valued parameter (TVP) with sample data, and calls the procedure to insert a sales order with five line items.
The ostress tool is used to execute the stored procedure calls in parallel, to simulate clients inserting sales orders concurrently.
Reset the demo after each stress run by executing Demo.usp_DemoReset. This procedure deletes the rows in the memory-optimized tables, truncates the disk-based tables, and executes a database checkpoint.
The following script is executed concurrently to simulate a sales order processing workload:
DECLARE @i AS INT = 0,
        @od AS Sales.SalesOrderDetailType_inmem,
        @SalesOrderID AS INT,
        @DueDate AS DATETIME2 = SYSDATETIME(),
        @CustomerID AS INT = RAND() * 8000,
        @BillToAddressID AS INT = RAND() * 10000,
        @ShipToAddressID AS INT = RAND() * 10000,
        @ShipMethodID AS INT = (RAND() * 5) + 1;
INSERT INTO @od
SELECT OrderQty,
ProductID,
SpecialOfferID
FROM Demo.DemoSalesOrderDetailSeed
WHERE OrderID = CAST ((RAND() * 106) + 1 AS INT);
WHILE (@i < 20)
BEGIN
EXECUTE Sales.usp_InsertSalesOrder_inmem
@SalesOrderID OUTPUT,
@DueDate,
@CustomerID,
@BillToAddressID,
@ShipToAddressID,
@ShipMethodID,
@od;
SET @i += 1;
END
With this script, each sample order that is constructed is inserted 20 times, through 20 stored procedure calls executed in a WHILE loop. The loop is used to account for the fact that the database is used to construct the sample order. In typical production environments, the mid-tier application constructs the sales order to be inserted.
The previous script inserts sales orders into memory-optimized tables. The script to insert sales orders into disk-based tables is derived by replacing the two occurrences of _inmem with _ondisk.
We use the ostress tool to execute the scripts using several concurrent connections. We use the parameter -n to control the number of connections, and the parameter -r to control how many times the script is executed on each connection.
Run the workload
To test at scale, we insert 10 million sales orders, using 100 connections. This test performs reasonably on a modest server (for example, 8 physical / 16 logical cores) with basic SSD storage for the log. If the test doesn't perform well on your hardware, take a look at the section Troubleshoot slow-running tests. If you want to reduce the level of stress for the test, lower the number of connections by changing the parameter -n. For example, to lower the connection count to 40, change the parameter -n100 to -n40.
As a performance measure for the workload we use the elapsed time as reported by ostress.exe after running the workload.
The following instructions and measurements use a workload that inserts 10 million sales orders. For instructions to run a scaled-down workload inserting 1 million sales orders, see the instructions in In-Memory OLTP\readme.txt that is part of the SQLServer2016Samples.zip archive.
Memory-optimized tables
We start by running the workload on memory-optimized tables. The following command opens 100 threads, each running for 5,000 iterations. Each iteration inserts 20 sales orders in separate transactions. There are 20 inserts per iteration to compensate for the fact that the database is used to generate the data to be inserted. This yields a total of 20 * 5,000 * 100 = 10,000,000 sales order inserts.
Open the RML Cmd Prompt, and execute the following command:
Select the Copy button to copy the command, and paste it into the RML Utilities command prompt.
ostress.exe -n100 -r5000 -S. -E -dAdventureWorks2022 -q -Q"DECLARE @i AS INT = 0, @od AS Sales.SalesOrderDetailType_inmem, @SalesOrderID AS INT, @DueDate AS DATETIME2 = SYSDATETIME(), @CustomerID AS INT = RAND() * 8000, @BillToAddressID AS INT = RAND() * 10000, @ShipToAddressID AS INT = RAND() * 10000, @ShipMethodID AS INT = (RAND() * 5) + 1; INSERT INTO @od SELECT OrderQty, ProductID, SpecialOfferID FROM Demo.DemoSalesOrderDetailSeed WHERE OrderID = CAST ((RAND() * 106) + 1 AS INT); WHILE (@i < 20) BEGIN EXECUTE Sales.usp_InsertSalesOrder_inmem @SalesOrderID OUTPUT, @DueDate, @CustomerID, @BillToAddressID, @ShipToAddressID, @ShipMethodID, @od; SET @i += 1; END"
On one test server with a total number of 8 physical (16 logical) cores, this took 2 minutes and 5 seconds. On a second test server with 24 physical (48 logical) cores, this took 1 minute and 0 seconds.
Observe the CPU utilization while the workload is running, for example by using Task Manager. You'll see that CPU utilization is close to 100%. If this isn't the case, you likely have a log IO bottleneck; see also Troubleshoot slow-running tests.
Disk-based tables
The following command runs the workload on disk-based tables. This workload might take a while to execute, which is largely due to latch contention in the system. Memory-optimized tables are latch-free and thus don't suffer from this problem.
Open the RML Cmd Prompt, and execute the following command:
Select the Copy button to copy the command, and paste it into the RML Utilities command prompt.
ostress.exe -n100 -r5000 -S. -E -dAdventureWorks2022 -q -Q"DECLARE @i AS INT = 0, @od AS Sales.SalesOrderDetailType_ondisk, @SalesOrderID AS INT, @DueDate AS DATETIME2 = sysdatetime(), @CustomerID AS INT = RAND() * 8000, @BillToAddressID AS INT = RAND() * 10000, @ShipToAddressID AS INT = RAND() * 10000, @ShipMethodID AS INT = (RAND() * 5) + 1; INSERT INTO @od SELECT OrderQty, ProductID, SpecialOfferID FROM Demo.DemoSalesOrderDetailSeed WHERE OrderID = CAST ((RAND() * 106) + 1 AS INT); WHILE (@i < 20) BEGIN EXECUTE Sales.usp_InsertSalesOrder_ondisk @SalesOrderID OUTPUT, @DueDate, @CustomerID, @BillToAddressID, @ShipToAddressID, @ShipMethodID, @od; SET @i += 1; END"
On one test server with a total number of 8 physical (16 logical) cores, this took 41 minutes and 25 seconds. On a second test server with 24 physical (48 logical) cores, this took 52 minutes and 16 seconds.
The main factor in the performance difference between memory-optimized tables and disk-based tables in this test, is that when using disk-based tables, SQL Server can't fully utilize the CPU. The reason is latch contention: concurrent transactions are attempting to write to the same data page; latches are used to ensure only one transaction at a time can write to a page. The In-Memory OLTP engine is latch-free, and data rows aren't organized in pages. Thus, concurrent transactions don't block each other's inserts, thus enabling SQL Server to fully utilize the CPU.
You can observe the CPU utilization while the workload is running, for example by using Task Manager. You'll see that with disk-based tables, the CPU utilization is far from 100%. On a test configuration with 16 logical processors, the utilization would hover around 24%.
Optionally, you can view the number of latch waits per second using Performance Monitor, with the performance counter \SQL Server:Latches\Latch Waits/sec.
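If you prefer T-SQL over Performance Monitor, a sketch of sampling the same counter from the DMVs is shown below (the value reported for per-second counters is cumulative, so take two samples a few seconds apart and compute the delta):

SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Latch Waits/sec';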
Reset the demo
To reset the demo, open the RML Cmd Prompt, and execute the following command:
ostress.exe -S. -E -dAdventureWorks2022 -Q"EXEC Demo.usp_DemoReset"
Depending on the hardware, this might take a few minutes to run.
We recommend a reset after every demo run. Because this workload is insert-only, each run consumes more memory, and thus a reset is required to prevent running out of memory. The amount of memory consumed after a run is discussed in Section Memory utilization after running the workload.
Troubleshoot slow-running tests
Test results typically vary with hardware, and also the level of concurrency used in the test run. A couple of things to look for if the results aren't as expected:
Number of concurrent transactions: When running the workload on a single thread, performance gain with In-Memory OLTP is likely less than 2X. Latch contention is only a significant problem if there's a high level of concurrency.
Low number of cores available to SQL Server: This means there's a low level of concurrency in the system, as there can only be as many concurrently executing transactions as there are cores available to SQL.
- Symptom: if the CPU utilization is high when running the workload on disk-based tables, this means there isn't a lot of contention, pointing to a lack of concurrency.
Speed of the log drive: If the log drive can't keep up with the level of transaction throughput in the system, the workload becomes bottlenecked on log IO. Although logging is more efficient with In-Memory OLTP, if log IO is a bottleneck, the potential performance gain is limited.
- Symptom: if the CPU utilization isn't close to 100%, or is very spiky, when running the workload on memory-optimized tables, it's possible there's a log IO bottleneck. This can be confirmed by opening Resource Monitor and looking at the queue length for the log drive, or by checking the wait statistics as shown in the sketch after this list.
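A quick way to check for a log IO bottleneck from T-SQL is to look at the WRITELOG wait statistics (a sketch; the numbers are cumulative since the last restart, or since the wait statistics were last cleared):

SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'WRITELOG';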
Memory and disk space utilization in the sample
This section describes what to expect in terms of memory and disk space utilization for the sample database, and shows the results from a test server with 16 logical cores.
Memory utilization for the memory-optimized tables
Overall utilization of the database
The following query can be used to obtain the total memory utilization for In-Memory OLTP in the system.
SELECT type,
name,
pages_kb / 1024 AS pages_MB
FROM sys.dm_os_memory_clerks
WHERE type LIKE '%xtp%';
Snapshot after the database has just been created:
| type | name | pages_MB |
|---|---|---|
| MEMORYCLERK_XTP | Default | 94 |
| MEMORYCLERK_XTP | DB_ID_5 | 877 |
| MEMORYCLERK_XTP | Default | 0 |
| MEMORYCLERK_XTP | Default | 0 |
The default memory clerks contain system-wide memory structures and are relatively small. The memory clerk for the user database, in this case database with ID 5 (the database_id might differ in your instance), is about 900 MB.
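To confirm which database the DB_ID_<N> clerk corresponds to on your instance, you can look up the database ID directly:

-- Map the database name to its ID to match the DB_ID_<N> clerk name.
SELECT DB_ID(N'AdventureWorks2022') AS [database_id];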
Memory utilization per table
The following query can be used to drill down into the memory utilization of the individual tables and their indexes:
SELECT object_name(t.object_id) AS [Table name],
memory_allocated_for_table_kb,
memory_allocated_for_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats AS dms
INNER JOIN sys.tables AS t
ON dms.object_id = t.object_id
WHERE t.type = 'U';
The following table displays the results of this query for a fresh installation of the sample:
| Table name | memory_allocated_for_table_kb | memory_allocated_for_indexes_kb |
|---|---|---|
| SpecialOfferProduct_inmem | 64 | 3840 |
| DemoSalesOrderHeaderSeed | 1984 | 5504 |
| SalesOrderDetail_inmem | 15316 | 663552 |
| DemoSalesOrderDetailSeed | 64 | 10432 |
| SpecialOffer_inmem | 3 | 8192 |
| SalesOrderHeader_inmem | 7168 | 147456 |
| Product_inmem | 124 | 12352 |
As you can see, the tables are fairly small: SalesOrderHeader_inmem is about 7 MB, and SalesOrderDetail_inmem is about 15 MB in size.
What is striking here is the size of the memory allocated for indexes, compared to the size of the table data. That is because the hash indexes in the sample are presized for a larger data size. Hash indexes have a fixed size, and thus their size doesn't grow with the size of data in the table.
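To inspect how full those presized hash indexes actually are, you can query the hash index statistics (a sketch; a high average chain length combined with few empty buckets would suggest the bucket count is too low):

SELECT OBJECT_NAME(hs.object_id) AS [Table name],
       i.name AS [Index name],
       hs.total_bucket_count,
       hs.empty_bucket_count,
       hs.avg_chain_length,
       hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS hs
INNER JOIN sys.indexes AS i
    ON hs.object_id = i.object_id
   AND hs.index_id = i.index_id;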
Memory utilization after running the workload
After inserting 10 million sales orders, check the all-up memory utilization again with the same query:
SELECT type,
name,
pages_kb / 1024 AS pages_MB
FROM sys.dm_os_memory_clerks
WHERE type LIKE '%xtp%';
Here's the result set.
| type | name | pages_MB |
|---|---|---|
| MEMORYCLERK_XTP | Default | 146 |
| MEMORYCLERK_XTP | DB_ID_5 | 7374 |
| MEMORYCLERK_XTP | Default | 0 |
| MEMORYCLERK_XTP | Default | 0 |
As you can see, SQL Server is using a bit under 8 GB for the memory-optimized tables and indexes in the sample database.
Looking at the detailed memory usage per table after one example run:
SELECT object_name(t.object_id) AS [Table name],
memory_allocated_for_table_kb,
memory_allocated_for_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats AS dms
INNER JOIN sys.tables AS t
ON dms.object_id = t.object_id
WHERE t.type = 'U';
Here's the result set.
| Table name | memory_allocated_for_table_kb | memory_allocated_for_indexes_kb |
|---|---|---|
| SalesOrderDetail_inmem | 5113761 | 663552 |
| DemoSalesOrderDetailSeed | 64 | 10368 |
| SpecialOffer_inmem | 2 | 8192 |
| SalesOrderHeader_inmem | 1575679 | 147456 |
| Product_inmem | 111 | 12032 |
| SpecialOfferProduct_inmem | 64 | 3712 |
| DemoSalesOrderHeaderSeed | 1984 | 5504 |
We can see a total of about 6.5 GB of data. The size of the indexes on the table SalesOrderHeader_inmem and SalesOrderDetail_inmem is the same as the size of the indexes before inserting the sales orders. The index size didn't change because both tables are using hash indexes, and hash indexes are static.
After demo reset
The stored procedure Demo.usp_DemoReset can be used to reset the demo. It deletes the data in the tables SalesOrderHeader_inmem and SalesOrderDetail_inmem, and reseeds the data from the original tables SalesOrderHeader and SalesOrderDetail.
Now, even though the rows in the tables were deleted, this doesn't mean that memory is reclaimed immediately. SQL Server reclaims memory from deleted rows in memory-optimized tables in the background, as needed. You see that immediately after demo reset, with no transactional workload on the system, memory from deleted rows isn't yet reclaimed:
SELECT type,
name,
pages_kb / 1024 AS pages_MB
FROM sys.dm_os_memory_clerks
WHERE type LIKE '%xtp%';
Here's the result set.
| type | name | pages_MB |
|---|---|---|
| MEMORYCLERK_XTP | Default | 2261 |
| MEMORYCLERK_XTP | DB_ID_5 | 7396 |
| MEMORYCLERK_XTP | Default | 0 |
| MEMORYCLERK_XTP | Default | 0 |
This is expected: memory is reclaimed when the transactional workload is running.
If you start a second run of the demo workload you see the memory utilization decrease initially, as the previously deleted rows are cleaned up. At some point, the memory size increases again until the workload finishes. After inserting 10 million rows after demo reset, the memory utilization is very similar to the utilization after the first run. For example:
SELECT type,
name,
pages_kb / 1024 AS pages_MB
FROM sys.dm_os_memory_clerks
WHERE type LIKE '%xtp%';
Here's the result set.
| type | name | pages_MB |
|---|---|---|
| MEMORYCLERK_XTP | Default | 1863 |
| MEMORYCLERK_XTP | DB_ID_5 | 7390 |
| MEMORYCLERK_XTP | Default | 0 |
| MEMORYCLERK_XTP | Default | 0 |
Disk utilization for memory-optimized tables
The overall on-disk size for the checkpoint files of a database at a given time can be found using the query:
SELECT SUM(df.size) * 8 / 1024 AS [On-disk size in MB]
FROM sys.filegroups AS f
INNER JOIN sys.database_files AS df
ON f.data_space_id = df.data_space_id
WHERE f.type = N'FX';
Initial state
When the sample filegroup and sample memory-optimized tables are created initially, several checkpoint files are precreated and the system starts filling the files - the number of checkpoint files precreated depends on the number of logical processors in the system. As the sample is initially very small, the precreated files are mostly empty after initial creation.
The following query returns the initial on-disk size for the sample; the result shown is from a machine with 16 logical processors:
SELECT SUM(df.size) * 8 / 1024 AS [On-disk size in MB]
FROM sys.filegroups AS f
INNER JOIN sys.database_files AS df
ON f.data_space_id = df.data_space_id
WHERE f.type = N'FX';
Here's the result set.
| On-disk size in MB |
|---|
| 2312 |
As you can see, there's a big discrepancy between the on-disk size of the checkpoint files, which is 2.3 GB, and the actual data size, which is closer to 30 MB.
Looking closer at where the disk-space utilization comes from, you can use the following query. The size on disk returned by this query is approximate for files with state in 5 (REQUIRED FOR BACKUP/HA), 6 (IN TRANSITION TO TOMBSTONE), or 7 (TOMBSTONE).
SELECT state_desc,
file_type_desc,
COUNT(*) AS [count],
SUM(CASE WHEN state = 5 AND file_type = 0 THEN 128 * 1024 * 1024
WHEN state = 5 AND file_type = 1 THEN 8 * 1024 * 1024
WHEN state IN (6, 7) THEN 68 * 1024 * 1024
ELSE file_size_in_bytes END) / 1024 / 1024 AS [on-disk size MB]
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY state, state_desc, file_type, file_type_desc
ORDER BY state, file_type;
For the initial state of the sample, the result looks something like the following table for a server with 16 logical processors:
| state_desc | file_type_desc | count | on-disk size MB |
|---|---|---|---|
| PRECREATED | DATA | 16 | 2048 |
| PRECREATED | DELTA | 16 | 128 |
| UNDER CONSTRUCTION | DATA | 1 | 128 |
| UNDER CONSTRUCTION | DELTA | 1 | 8 |
As you can see, most of the space is used by precreated data and delta files. SQL Server precreates one pair of (data, delta) files per logical processor. In addition, data files are presized at 128 MB, and delta files at 8 MB, to make inserting data into these files more efficient.
The actual data in the memory-optimized tables is in the single data file.
After running the workload
After a single test run that inserts 10 million sales orders, the overall on-disk size looks something like this (for a 16-core test server):
SELECT SUM(df.size) * 8 / 1024 AS [On-disk size in MB]
FROM sys.filegroups AS f
INNER JOIN sys.database_files AS df
ON f.data_space_id = df.data_space_id
WHERE f.type = N'FX';
Here's the result set.
| On-disk size in MB |
|---|
| 8828 |
The on-disk size is close to 9 GB, which comes close to the in-memory size of the data.
Looking more closely at the sizes of the checkpoint files across the various states:
SELECT state_desc,
file_type_desc,
COUNT(*) AS [count],
SUM(CASE WHEN state = 5 AND file_type = 0 THEN 128 * 1024 * 1024
WHEN state = 5 AND file_type = 1 THEN 8 * 1024 * 1024
WHEN state IN (6, 7) THEN 68 * 1024 * 1024
ELSE file_size_in_bytes END) / 1024 / 1024 AS [on-disk size MB]
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY state, state_desc, file_type, file_type_desc
ORDER BY state, file_type;
Here's the result set.
| state_desc | file_type_desc | count | on-disk size MB |
|---|---|---|---|
| PRECREATED | DATA | 16 | 2048 |
| PRECREATED | DELTA | 16 | 128 |
| UNDER CONSTRUCTION | DATA | 1 | 128 |
| UNDER CONSTRUCTION | DELTA | 1 | 8 |
We still have 16 pairs of precreated files, ready to go as checkpoints are closed.
There's one pair under construction, which is used until the current checkpoint is closed. Along with the active checkpoint files this gives about 6.5 GB of disk utilization for 6.5 GB of data in memory. Recall that indexes aren't persisted on disk, and thus the overall size on disk is smaller than the size in memory in this case.
After demo reset
After demo reset, disk space isn't reclaimed immediately if there's no transactional workload on the system, and there aren't database checkpoints. For checkpoint files to be moved through their various stages and eventually be discarded, several checkpoints and log truncation events need to happen, to initiate merge of checkpoint files, as well as to initiate garbage collection. These happen automatically if you have a transactional workload in the system (and take regular log backups, in case you're using the FULL recovery model), but not when the system is idle, as in a demo scenario.
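If you want to nudge this process along manually in a demo setting, the following is a sketch, assuming the FULL recovery model and a throwaway local backup path (adjust both for your environment):

-- Checkpoints and log truncation allow checkpoint file merges and garbage collection to proceed.
CHECKPOINT;
GO
BACKUP LOG AdventureWorks2022 TO DISK = N'C:\Temp\AdventureWorks2022_log.bak';
GO
CHECKPOINT;
GO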
In the example, after demo reset, you might see something like:
SELECT SUM(df.size) * 8 / 1024 AS [On-disk size in MB]
FROM sys.filegroups AS f
INNER JOIN sys.database_files AS df
ON f.data_space_id = df.data_space_id
WHERE f.type = N'FX';
Here's the result set.
| On-disk size in MB |
|---|
| 11839 |
At nearly 12 GB, this is significantly more than the 9 GB we had before the demo reset. This is because some checkpoint file merges were started, but some of the merge targets haven't yet been installed, and some of the merge source files haven't yet been cleaned up, as can be seen from the following example:
SELECT state_desc,
file_type_desc,
COUNT(*) AS [count],
SUM(CASE WHEN state = 5 AND file_type = 0 THEN 128 * 1024 * 1024
WHEN state = 5 AND file_type = 1 THEN 8 * 1024 * 1024
WHEN state IN (6, 7) THEN 68 * 1024 * 1024
ELSE file_size_in_bytes END) / 1024 / 1024 AS [on-disk size MB]
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY state, state_desc, file_type, file_type_desc
ORDER BY state, file_type;
Here's the result set.
| state_desc | file_type_desc | count | on-disk size MB |
|---|---|---|---|
| PRECREATED | DATA | 16 | 2048 |
| PRECREATED | DELTA | 16 | 128 |
| ACTIVE | DATA | 38 | 5152 |
| ACTIVE | DELTA | 38 | 1331 |
| MERGE TARGET | DATA | 7 | 896 |
| MERGE TARGET | DELTA | 7 | 56 |
| MERGED SOURCE | DATA | 13 | 1772 |
| MERGED SOURCE | DELTA | 13 | 455 |
Merge targets are installed and merge source files are cleaned up as transactional activity happens in the system.
After a second run of the demo workload, inserting 10 million sales orders after the demo reset, you see that the files constructed during the first run of the workload were cleaned up. If you run the previous query several times while the workload is running, you can see the checkpoint files make their way through the various stages.
After the second run of the workload, which inserts 10 million sales orders after the demo reset, you see disk utilization very similar to, though not necessarily the same as, after the first run, because the system is dynamic in nature. For example:
SELECT state_desc,
file_type_desc,
COUNT(*) AS [count],
SUM(CASE WHEN state = 5 AND file_type = 0 THEN 128 * 1024 * 1024
WHEN state = 5 AND file_type = 1 THEN 8 * 1024 * 1024
WHEN state IN (6, 7) THEN 68 * 1024 * 1024
ELSE file_size_in_bytes END) / 1024 / 1024 AS [on-disk size MB]
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY state, state_desc, file_type, file_type_desc
ORDER BY state, file_type;
Here's the result set.
| state_desc | file_type_desc | count | on-disk size MB |
|---|---|---|---|
| PRECREATED | DATA | 16 | 2048 |
| PRECREATED | DELTA | 16 | 128 |
| UNDER CONSTRUCTION | DATA | 2 | 268 |
| UNDER CONSTRUCTION | DELTA | 2 | 16 |
| ACTIVE | DATA | 41 | 5608 |
| ACTIVE | DELTA | 41 | 328 |
In this case, there are two checkpoint file pairs in the UNDER CONSTRUCTION state, which means multiple file pairs were moved to the UNDER CONSTRUCTION state, likely due to the high level of concurrency in the workload. Multiple concurrent threads required a new file pair at the same time, and thus moved a pair from PRECREATED to UNDER CONSTRUCTION.