- **UPDATE 1 - RU/m**
- I read through the [documentation on Request Units][2], and noticed it talked about RU/s and RU/m.
- Unfortunately it had very little information, and what it did have was very unclear.
- But there was a second [documentation][3] page which then went in to more detail.
- From what I can gather, RU/m provides a way to reliably burst above the RU/s you have set.
- So, if you have 400 RU/s you can also have 4000 RU/m. If you exceed your RU/s then you can take RU/m to "top up" the throughput. So an 800 RU/s query would consume 400 RU/s and then 400 RU from the RU/m.
- Since I'm seeing performance which is below my RU/s threshold I didn't think adding RU/m would affect my situation, but I tried it anyway.
- I re-ran my query and got 8728 documents, in 187595.148 ms, using 3917.92 RUs.
- That's even slower than the first run, though I expect that's due to variations in network and other environment conditions between runs.
- (So I have now disabled RU/m again since it does cost money but provides no benefit here.)
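- A minimal sketch of the top-up arithmetic described above. It only models the behaviour as this post reads it from the documentation; `ThroughputBudget` and its members are hypothetical names, not part of the DocumentDB SDK, and the per-minute reset is deliberately left out.

```csharp
using System;

// Hypothetical model of RU/s with an RU/m "top up" budget.
// Not an SDK type; it only illustrates the arithmetic.
public class ThroughputBudget
{
    private readonly double perSecond;   // provisioned RU/s, e.g. 400
    private double minuteRemaining;      // remaining RU/m burst budget, e.g. 4000

    public ThroughputBudget(double perSecond, double perMinute)
    {
        this.perSecond = perSecond;
        this.minuteRemaining = perMinute;
    }

    public double MinuteRemaining => this.minuteRemaining;

    // Demand within one second: anything above the RU/s allowance
    // is taken from the RU/m budget. Returns false when the budget
    // cannot cover the overflow (throttled, in this model).
    public bool TryConsume(double requestedThisSecond)
    {
        var overflow = Math.Max(0.0, requestedThisSecond - this.perSecond);
        if (overflow > this.minuteRemaining)
            return false;
        this.minuteRemaining -= overflow;
        return true;
    }
}
```

- With `new ThroughputBudget(400, 4000)`, a call to `TryConsume(800)` takes 400 RU from the per-second allowance and 400 RU from the per-minute budget, leaving 3600 RU/m.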
- **UPDATE 2 - The Polygon**
- Here's the polygon I'm using in the query:
- Area = new Polygon(new List<LinearRing>()
- {
- new LinearRing(new List<Position>()
- {
- new Position(1.8567 ,51.3814),
- new Position(0.5329 ,51.4618),
- new Position(0.2477 ,51.2588),
- new Position(-0.5329 ,51.2579),
- new Position(-1.17 ,51.2173),
- new Position(-1.9062 ,51.1958),
- new Position(-2.5434 ,51.1614),
- new Position(-3.8672 ,51.139 ),
- new Position(-4.1578 ,50.9137),
- new Position(-4.5373 ,50.694 ),
- new Position(-5.1496 ,50.3282),
- new Position(-5.2212 ,49.9586),
- new Position(-3.7049 ,50.142 ),
- new Position(-2.1698 ,50.314 ),
- new Position(0.4669 ,50.6976),
- new Position(1.8567 ,51.3814)
- })
- })
- I have also tried reversing it (since ring orientation matters), but the query with the reversed polygon took significantly longer (I don't have the time to hand) and returned 91272 items.
- Also, the coordinates are specified as Longitude/Latitude, as [this is how GeoJSON expects them][4] (i.e. as X/Y), rather than the traditional order used when speaking of Latitude/Longitude.
- > The GeoJSON specification specifies longitude first and latitude second.
- **UPDATE 3 - Data Size and Connection Speed**
- The overview on the Azure Portal for my collection is showing 110 MB of data.
- Running a speed test on my internet connection shows a consistent 65 Mbit/s.
- So ignoring any other factors (such as how long it takes to collate the data server-side), it should take ~14 s for the entire data set to download in one big chunk.
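- The ~14 s estimate above is just size divided by bandwidth. A small sketch of that arithmetic, assuming the 110 MB figure is megabytes and the 65 Mbit/s figure is megabits per second; `TransferEstimate` is a hypothetical helper name.

```csharp
public static class TransferEstimate
{
    // Ideal transfer time in seconds for a payload given in megabytes
    // over a link given in megabits per second (no protocol overhead).
    public static double Seconds(double megabytes, double megabitsPerSecond)
    {
        const double bitsPerByte = 8.0;
        return megabytes * bitsPerByte / megabitsPerSecond;
    }
}
```

- `TransferEstimate.Seconds(110, 65)` comes out at roughly 13.5 seconds.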
- Out of interest I tried getting all the documents, and it took 1923070.0002 ms, which is 1923 seconds, or 32 minutes.
- This is the query I used:
- var query = client
- .CreateDocumentQuery<T>(documentCollectionUri)
- .Where(s => s.Type == this.subscriptionType);
- var documents = await this.QueryTrackingUsedRUsAsync(query);
- As before, I passed the query to my `QueryTrackingUsedRUsAsync` method to track the RUs used.
- **UPDATE 4 - Azure Portal Statistics**
- Here's the "Max consumed RU/s per physical partition" graph from just after I'd finished the query just above, which is retrieving all documents without the geospatial filter. The query still contains a check against a property which is an enum. This enum indicates the "type" of the document, as in the future more than one "type" of these documents will be stored in the collection.
- [![Azure Portal - Max RUs][5]][5]
- It appears the Max RUs never go above ~80. And that seems to be some kind of hard limit.
- Again, I find this confusing as from what I've read it should be using the full throughput, and from one of the channel 9 videos I watched it said that the documents will be collated and then sent as one big chunk.
- I'm either misunderstanding what should be happening, or I've got some insane client-side code that is screwing things up, or there's something wrong server-side.
- **UPDATE 5 - Non-Spatial Parts**
- There are two things I'm doing during the querying which I haven't yet cast much of a critical eye on.
- The first is the method which counts used RUs. Having that information is very useful, and is very much recommended in the documentation so you can monitor and tune your queries.
- (As an aside: considering that this information is so important, it's a pain in the arse to access using the .Net SDK.)
- The second is that there's also a filter based on an enum property. This enum indicates the "document type", and will be used to differentiate documents in the future. (e.g. the collection contains DocumentAs and DocumentBs, and I only care about As.)
- So I created a new test, similar to my previous test, where all documents are retrieved omitting the spatial query. In addition it also omits the Document Type filter, and it no longer uses the RU tracking.
- Here's my query:
- var query = client.CreateDocumentQuery<T>(documentCollectionUri);
- var documents = query.ToList();
- It took 1802902.6053 ms to get all 100,000 items, which is ~1803 seconds, or ~30 minutes.
- These results seem to indicate that there is an issue with returning large numbers of documents.
- As mentioned above, the data is 110 MB in total, which should take roughly 14 seconds to transmit at the speed of my internet connection.
- There is obviously a very large difference between 14 seconds and 30 minutes.
- **UPDATE 6 - Local VS Cloud**
- As mentioned, I have been running this code on my local network.
- Eventually it will be run in the same data center as the CosmosDB, but during development/testing I was willing to take the extra performance hit.
- From what I understand the main difference between the two environments is the latency of network calls. The bandwidth is less important as it's not very much data.
- From what I've heard in a channel 9 video, the data is collected and merged in to one big chunk for sending.
- If it is indeed sent in a single chunk then the latency differences should be almost negligible as it'll only be a single call.
- If there is a separate call for each of the 100,000 items then the latency would build up a lot more. (100,000 times more heh).
- Running a few tests, it seems the latency to the Azure data center is roughly 55 ms from my local network, and it should be <10 ms in the data center itself.
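- To put numbers on the single-call versus per-item reasoning above, here is a rough sketch that counts only network latency and assumes the calls are made one after another. `LatencyEstimate` is a hypothetical helper, and the model ignores server-side time and transfer time.

```csharp
public static class LatencyEstimate
{
    // Total time spent waiting on the network, in seconds, for a number
    // of sequential round trips at a given latency in milliseconds.
    public static double TotalSeconds(int roundTrips, double latencyMs)
    {
        return roundTrips * latencyMs / 1000.0;
    }
}
```

- At 55 ms, a single call costs 0.055 s, while 100,000 sequential calls would cost 5,500 s (about 92 minutes). At 10 ms the same 100,000 calls would cost 1,000 s.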
- Running it on a VM in the cloud took 1331346.3065 ms, which is ~1331 seconds, or 22 minutes.
- That's better, but it still seems, well, unusable.
- The Azure graphs look the same too, the throughput seems to top out at 80 RU/s.
- **UPDATE 7 - Sample Document**
- Here's the JSON for one of my documents:
- {
- "GeoTrigger": null,
- "SeverityTrigger": -1,
- "TypeTrigger": -1,
- "Name": "13, LONSDALE SQUARE, LONDON, N1 1EN",
- "IsEnabled": true,
- "Type": 2,
- "Location": {
- "$type": "Microsoft.Azure.Documents.Spatial.Point, Microsoft.Azure.Documents.Client",
- "type": "Point",
- "coordinates": [
- -0.1076407397346815,
- 51.53970315059827
- ]
- },
- "id": "0dc2c03e-082b-4aea-93a8-79d89546c12b",
- "_rid": "EQttAMGhSQDWPwAAAAAAAA==",
- "_self": "dbs/EQttAA==/colls/EQttAMGhSQA=/docs/EQttAMGhSQDWPwAAAAAAAA==/",
- "_etag": "\"42001028-0000-0000-0000-594943fe0000\"",
- "_attachments": "attachments/",
- "_ts": 1497973747
- }
- **UPDATE 8 - The Repro Code**
- I decided it would be best if I shared the majority of my code as that may be part of the problem.
- There is one small part I haven't included, which is initially retrieving the addresses for creating the test documents.
- It involves proprietary data and some processing that isn't relevant to the DocumentDB interaction.
- It can be replaced by generating random coordinates within the UK bounds.
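- A possible stand-in for the omitted part, sketched under assumptions: the bounding box values are a rough approximation of the UK chosen for illustration, and `RandomUkCoordinates` is a hypothetical name rather than the `External` class used in the repro. It returns longitude/latitude pairs, which could then be wrapped in `Point` objects.

```csharp
using System;
using System.Collections.Generic;

public static class RandomUkCoordinates
{
    // Rough bounding box for the UK. These values are an approximation
    // chosen for illustration, not taken from the original data.
    public const double MinLongitude = -8.0;
    public const double MaxLongitude = 2.0;
    public const double MinLatitude = 49.9;
    public const double MaxLatitude = 59.0;

    // Returns `count` pairs in Longitude,Latitude order (i.e. X/Y).
    // A fixed seed keeps test data repeatable between runs.
    public static List<(double Longitude, double Latitude)> Create(int count, int seed = 0)
    {
        var random = new Random(seed);
        var result = new List<(double Longitude, double Latitude)>(count);
        for (var i = 0; i < count; i++)
        {
            var longitude = MinLongitude + random.NextDouble() * (MaxLongitude - MinLongitude);
            var latitude = MinLatitude + random.NextDouble() * (MaxLatitude - MinLatitude);
            result.Add((longitude, latitude));
        }
        return result;
    }
}
```

- For example, `RandomUkCoordinates.Create(10000)` yields 10,000 pairs, each of which could be passed to `new Point(longitude, latitude)` when building a test document.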
- These are some of the execution times from yesterday:
- 1,000 documents: 2278.8806 ms
- 10,000 documents: 17516.9564 ms
- 100,000 documents: 173262.2865 ms
- As these times seem pretty linear, I'm dropping from 100,000 documents to 10,000 to improve the rate at which I can iterate revisions.
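- One quick way to check the "pretty linear" claim is to divide each time by its document count. A tiny sketch, with `Scaling` as a hypothetical helper name:

```csharp
public static class Scaling
{
    // Average time per document in milliseconds.
    public static double MillisecondsPerDocument(int documentCount, double totalMilliseconds)
    {
        return totalMilliseconds / documentCount;
    }
}
```

- For the three runs above this gives roughly 2.28 ms, 1.75 ms and 1.73 ms per document, so the cost per document is close to constant once there are enough documents to amortise the start-up cost.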
- Here's the repro code:
- using System;
- using System.Collections.Generic;
- using System.Configuration;
- using System.Diagnostics;
- using System.Linq;
- using System.Runtime.CompilerServices;
- using System.Threading;
- using System.Threading.Tasks;
- using Microsoft.Azure.Documents;
- using Microsoft.Azure.Documents.Client;
- using Microsoft.Azure.Documents.Spatial;
- namespace Repro.Cli
- {
- public class Program
- {
- static void Main(string[] args)
- {
- //AJ: Init logging
- Trace.AutoFlush = true;
- Trace.Listeners.Add(new ConsoleTraceListener());
- Trace.Listeners.Add(new TextWriterTraceListener("trace.log"));
- //AJ: Increase available threads
- //AJ: https://docs.microsoft.com/en-us/azure/storage/storage-performance-checklist#subheading10
- //AJ: https://github.com/Azure/azure-documentdb-dotnet/blob/master/samples/documentdb-benchmark/Program.cs
- var minThreadPoolSize = 100;
- ThreadPool.SetMinThreads(minThreadPoolSize, minThreadPoolSize);
- //AJ: https://docs.microsoft.com/en-us/azure/cosmos-db/performance-tips
- //AJ: gcServer enabled in app.config
- //AJ: Prefer 32-bit disabled in project properties
- //AJ: DO IT
- var program = new Program();
- Trace.TraceInformation($"Starting @ {DateTime.UtcNow}");
- program.RunAsync().Wait();
- Trace.TraceInformation($"Finished @ {DateTime.UtcNow}");
- //AJ: Wait for user to exit
- Console.WriteLine();
- Console.WriteLine("Hit enter to exit...");
- Console.ReadLine();
- }
- public async Task RunAsync()
- {
- using (new CodeTimer())
- {
- var client = await this.GetDocumentClientAsync();
- var documentCollectionUri = UriFactory.CreateDocumentCollectionUri(ConfigurationManager.AppSettings["databaseID"], ConfigurationManager.AppSettings["collectionID"]);
- //AJ: Prepare Test Documents
- //var documentCount = 10000; //AJ: 10,000
- //var documentsForUpsert = this.GetDocuments(documentCount);
- //await this.UpsertDocumentsAsync(client, documentCollectionUri, documentsForUpsert);
- var allDocuments = this.GetAllDocuments(client, documentCollectionUri);
- var area = this.GetArea();
- var documentsInArea = this.GetDocumentsInArea(client, documentCollectionUri, area);
- }
- }
- private async Task<DocumentClient> GetDocumentClientAsync()
- {
- using (new CodeTimer())
- {
- var serviceEndpointUri = new Uri(ConfigurationManager.AppSettings["serviceEndpoint"]);
- var authKey = ConfigurationManager.AppSettings["authKey"];
- var connectionPolicy = new ConnectionPolicy
- {
- ConnectionMode = ConnectionMode.Direct,
- ConnectionProtocol = Protocol.Tcp,
- RequestTimeout = new TimeSpan(1, 0, 0),
- RetryOptions = new RetryOptions
- {
- MaxRetryAttemptsOnThrottledRequests = 10,
- MaxRetryWaitTimeInSeconds = 60
- }
- };
- var client = new DocumentClient(serviceEndpointUri, authKey, connectionPolicy);
- await client.OpenAsync();
- return client;
- }
- }
- private List<TestDocument> GetDocuments(int count)
- {
- using (new CodeTimer())
- {
- return External.CreateDocuments(count);
- }
- }
- private async Task UpsertDocumentsAsync(DocumentClient client, Uri documentCollectionUri, List<TestDocument> documents)
- {
- using (new CodeTimer())
- {
- //TODO: AJ: Parallelise
- foreach (var document in documents)
- {
- await client.UpsertDocumentAsync(documentCollectionUri, document);
- }
- }
- }
- private List<TestDocument> GetAllDocuments(DocumentClient client, Uri documentCollectionUri)
- {
- using (new CodeTimer())
- {
- var query = client
- .CreateDocumentQuery<TestDocument>(documentCollectionUri, new FeedOptions()
- {
- MaxItemCount = 1000
- });
- var documents = query.ToList();
- return documents;
- }
- }
- private Polygon GetArea()
- {
- //AJ: Longitude,Latitude i.e. X/Y
- //AJ: Ring orientation matters
- return new Polygon(new List<LinearRing>()
- {
- new LinearRing(new List<Position>()
- {
- new Position(1.8567 ,51.3814),
- new Position(0.5329 ,51.4618),
- new Position(0.2477 ,51.2588),
- new Position(-0.5329 ,51.2579),
- new Position(-1.17 ,51.2173),
- new Position(-1.9062 ,51.1958),
- new Position(-2.5434 ,51.1614),
- new Position(-3.8672 ,51.139 ),
- new Position(-4.1578 ,50.9137),
- new Position(-4.5373 ,50.694 ),
- new Position(-5.1496 ,50.3282),
- new Position(-5.2212 ,49.9586),
- new Position(-3.7049 ,50.142 ),
- new Position(-2.1698 ,50.314 ),
- new Position(0.4669 ,50.6976),
- //AJ: Last point must be the same as first point
- new Position(1.8567 ,51.3814)
- })
- });
- }
- private List<TestDocument> GetDocumentsInArea(DocumentClient client, Uri documentCollectionUri, Polygon area)
- {
- using (new CodeTimer())
- {
- var query = client
- .CreateDocumentQuery<TestDocument>(documentCollectionUri, new FeedOptions()
- {
- MaxItemCount = 1000
- })
- .Where(document => document.Location.Intersects(area));
- var documents = query.ToList();
- return documents;
- }
- }
- }
- public class TestDocument : Resource
- {
- public string Name { get; set; }
- public Point Location { get; set; } //AJ: Longitude,Latitude i.e. X/Y
- public TestDocument()
- {
- this.Id = Guid.NewGuid().ToString("N");
- }
- }
- //AJ: This should be "good enough". The times being recorded are seconds or minutes.
- public class CodeTimer : IDisposable
- {
- private Action<TimeSpan> reportFunction;
- private Stopwatch stopwatch = new Stopwatch();
- public CodeTimer([CallerMemberName]string name = "")
- : this((ellapsed) =>
- {
- Trace.TraceInformation($"{name} took {ellapsed}, or {ellapsed.TotalMilliseconds} ms.");
- })
- { }
- public CodeTimer(Action<TimeSpan> report)
- {
- this.reportFunction = report;
- this.stopwatch.Start();
- }
- public void Dispose()
- {
- this.stopwatch.Stop();
- this.reportFunction(this.stopwatch.Elapsed);
- }
- }
- }
- This code had a very interesting result ... **it's faster!**
- Or to put it another way ... it fails at being a reproduction, as it doesn't reproduce the issue.
- I am happy with this result! It means I now have some "working" code, and some "nonworking" code, and I can slowly apply each difference to the working code until it stops working.
- This is the log when running locally:
- Starting @ 22/06/2017 11:02:23
- GetDocumentClientAsync took 00:00:01.9866833, or 1986.6833 ms.
- GetAllDocuments took 00:00:05.6934298, or 5693.4298 ms.
- GetDocumentsInArea took 00:00:00.4367697, or 436.7697 ms.
- RunAsync took 00:00:08.3177466, or 8317.7466 ms.
- Finished @ 22/06/2017 11:02:32
- So that's 6 seconds to return all 10,000 records, and 0.5 seconds to run the spatial query! This is the kind of performance I was hoping for.
- And this is the log when running in the cloud:
- Starting @ 6/22/2017 11:04:17 AM
- GetDocumentClientAsync took 00:00:01.0331417, or 1033.1417 ms.
- GetAllDocuments took 00:00:05.9973366, or 5997.3366 ms.
- GetDocumentsInArea took 00:00:01.7479158, or 1747.9158 ms.
- RunAsync took 00:00:08.7885318, or 8788.5318 ms.
- Finished @ 6/22/2017 11:04:26 AM
- So that's 6 seconds to return all 10,000 records, and 2 seconds to run the spatial query. It's a little odd that the query is slower in the cloud, but I suspect that's due to network/environment conditions. The performance is good enough, and way better than it was.
- Much, much better than yesterday.
- Now, to make changes to the working code until I can replicate the poor performance I was seeing yesterday.
- **UPDATE 9 - The Differences**
- I'll run through each difference I can find, and after each one I'll revert the code back to what's in the Repro update. This should show what impact each difference has independently.
- As there is a limited difference in runtimes between running locally and in the cloud, I'll only run locally for these tests to save time.
- The first difference I noticed, is that while I was creating the `ConnectionPolicy` object in my old code, I was failing to pass it to the `DocumentClient`. An easy mistake to make, but one I suspect will have a drastic impact as the `ConnectionMode` will default to `Gateway`, and the `ConnectionProtocol` to `HTTPS`.
- Code Changes:
- //var client = new DocumentClient(serviceEndpointUri, authKey, connectionPolicy);
- var client = new DocumentClient(serviceEndpointUri, authKey);
- Results:
- Starting @ 22/06/2017 11:09:25
- GetDocumentClientAsync took 00:00:01.3944992, or 1394.4992 ms.
- GetAllDocuments took 00:00:05.0646737, or 5064.6737 ms.
- GetDocumentsInArea took 00:00:01.5874742, or 1587.4742 ms.
- RunAsync took 00:00:08.2558457, or 8255.8457 ms.
- Finished @ 22/06/2017 11:09:34
- 5 seconds / 1.5 seconds.
- So it seems that the `ConnectionPolicy` has little impact for my specific tests; the difference in times is well within the variance seen between runs.
- Next up, I previously wasn't setting the `MaxItemCount` for the `FeedOptions` when executing a query.
- Code Changes:
- //var query = client
- // .CreateDocumentQuery<TestDocument>(documentCollectionUri, new FeedOptions()
- // {
- // MaxItemCount = 1000
- // });
- var query = client.CreateDocumentQuery<TestDocument>(documentCollectionUri);
- //var query = client
- // .CreateDocumentQuery<TestDocument>(documentCollectionUri, new FeedOptions()
- // {
- // MaxItemCount = 1000
- // })
- // .Where(document => document.Location.Intersects(area));
- var query = client.CreateDocumentQuery<TestDocument>(documentCollectionUri).Where(document => document.Location.Intersects(area));
- Results:
- Starting @ 22/06/2017 11:12:01
- GetDocumentClientAsync took 00:00:02.2025152, or 2202.5152 ms.
- GetAllDocuments took 00:00:09.4601258, or 9460.1258 ms.
- GetDocumentsInArea took 00:00:01.2358951, or 1235.8951 ms.
- RunAsync took 00:00:13.1350025, or 13135.0025 ms.
- Finished @ 22/06/2017 11:12:14
- 9.5 seconds / 1.2 seconds.
- Returning lots of results took longer, which is to be expected as they're being returned in a larger number of smaller chunks. But it's still not as bad as yesterday.
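- The "larger number of smaller chunks" effect can be estimated by counting pages. A sketch, assuming every page comes back full (the service may return fewer items per page than `MaxItemCount`, so this is a lower bound); `Paging` is a hypothetical helper and the page size of 100 below is only an illustrative value.

```csharp
public static class Paging
{
    // Minimum number of round trips needed to read `documentCount`
    // documents when each page holds at most `pageSize` documents.
    public static int RoundTrips(int documentCount, int pageSize)
    {
        return (documentCount + pageSize - 1) / pageSize; // integer ceiling
    }
}
```

- `Paging.RoundTrips(10000, 1000)` is 10, whereas `Paging.RoundTrips(10000, 100)` is 100, so a smaller page size multiplies the number of calls that each pay the network latency.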
- Another difference is that I was using some custom `JsonSerializerSettings` to include type information, and the `Location` property was a `Geometry` type containing a `Point`, rather than directly being a `Point`.
- (For this test I had to reinsert the test documents, as the JSON serializer output is different.)
- Code Changes:
- //AJ: Init Serializer
- JsonConvert.DefaultSettings = () =>
- {
- return new JsonSerializerSettings
- {
- TypeNameHandling = TypeNameHandling.Auto
- };
- };
- //public Point Location { get; set; } //AJ: Longitude,Latitude i.e. X/Y
- public Geometry Location { get; set; } //AJ: Longitude,Latitude i.e. X/Y
- Results:
- Starting @ 22/06/2017 11:23:20
- GetDocumentClientAsync took 00:00:01.9610856, or 1961.0856 ms.
- GetDocuments took 00:00:10.4402680, or 10440.268 ms.
- UpsertDocumentsAsync took 00:06:30.6676130, or 390667.613 ms.
- GetAllDocuments took 00:00:06.9084585, or 6908.4585 ms.
- GetDocumentsInArea took 00:00:00.7809068, or 780.9068 ms.
- RunAsync took 00:06:51.0916425, or 411091.6425 ms.
- Finished @ 22/06/2017 11:30:11
- 7 seconds, 1 second.
- It took a lot longer to run, but that's because the initial insert was included. The run time of the queries was still well within the normal variance between runs.
- Yet another difference is that my previous document had a few extra properties, and so was slightly larger.
- (For this test I had to reinsert the test documents, as the document shape is different.)
- Code Changes:
- public enum TestEnum
- {
- Default = 0,
- Foo = 1,
- Bar = 2
- }
- public class TestDocument : Resource
- {
- public object GeoTrigger { get; set; }
- public TestEnum SeverityTrigger { get; set; }
- public TestEnum TypeTrigger { get; set; }
- public TestEnum Type { get; set; } = TestEnum.Foo;
- public bool IsEnabled { get; set; } = true;
- public string Name { get; set; }
- public Point Location { get; set; } //AJ: Longitude,Latitude i.e. X/Y
- public TestDocument()
- {
- this.Id = Guid.NewGuid().ToString("N");
- }
- }
- Results:
- Starting @ 22/06/2017 11:41:58
- GetDocumentClientAsync took 00:00:01.8769841, or 1876.9841 ms.
- GetDocuments took 00:00:10.7117590, or 10711.759 ms.
- UpsertDocumentsAsync took 00:06:42.9010966, or 402901.0966 ms.
- GetAllDocuments took 00:00:07.7120264, or 7712.0264 ms.
- GetDocumentsInArea took 00:00:00.9173274, or 917.3274 ms.
- RunAsync took 00:07:04.5086401, or 424508.6401 ms.
- Finished @ 22/06/2017 11:49:03
- 8 seconds / 1 second.
- Still no real difference.
- I'm starting to run out of ideas, but while looking at the `JsonSerializerSettings` I found I'd also left a custom `ContractResolver` in the original code, so let's try that! The `ContractResolver` isn't needed at all, but maybe it was impacting the performance.
- Code Changes:
- //AJ: Init Serializer
- JsonConvert.DefaultSettings = () =>
- {
- return new JsonSerializerSettings
- {
- ContractResolver = new PropertyNameMapContractResolver(new Dictionary<string, string>()
- {
- { "ID", "id" }
- })
- };
- };
- public class PropertyNameMapContractResolver : DefaultContractResolver
- {
- private Dictionary<string, string> propertyNameMap;
- public PropertyNameMapContractResolver(Dictionary<string, string> propertyNameMap)
- {
- this.propertyNameMap = propertyNameMap;
- }
- protected override string ResolvePropertyName(string propertyName)
- {
- if (this.propertyNameMap.TryGetValue(propertyName, out string resolvedName))
- return resolvedName;
- return base.ResolvePropertyName(propertyName);
- }
- }
- Results:
- Starting @ 22/06/2017 11:56:11
- GetDocumentClientAsync took 00:00:02.0503540, or 2050.354 ms.
- GetDocuments took 00:00:17.0165692, or 17016.5692 ms.
- UpsertDocumentsAsync took 00:10:48.9390534, or 648939.0534 ms.
- GetAllDocuments took 00:02:28.1758434, or 148175.8434 ms.
- GetDocumentsInArea took 00:00:13.1400179, or 13140.0179 ms.
- RunAsync took 00:13:49.6587200, or 829658.72 ms.
- Finished @ 22/06/2017 12:10:01
- 2 minutes 28 seconds / 13 seconds.
- It looks like we have a winner! I thought that this wouldn't be an issue, but it just goes to show you should check all the possible causes you find, not just the obvious ones!
- So here's the offending reproduction code in full:
- using System;
- using System.Collections.Generic;
- using System.Configuration;
- using System.Diagnostics;
- using System.Linq;
- using System.Runtime.CompilerServices;
- using System.Threading;
- using System.Threading.Tasks;
- using Microsoft.Azure.Documents;
- using Microsoft.Azure.Documents.Client;
- using Microsoft.Azure.Documents.Spatial;
- using Newtonsoft.Json;
- using Newtonsoft.Json.Serialization;
- namespace Repro.Cli
- {
- public class Program
- {
- static void Main(string[] args)
- {
- JsonConvert.DefaultSettings = () =>
- {
- return new JsonSerializerSettings
- {
- ContractResolver = new PropertyNameMapContractResolver(new Dictionary<string, string>()
- {
- { "ID", "id" }
- })
- };
- };
- //AJ: Init logging
- Trace.AutoFlush = true;
- Trace.Listeners.Add(new ConsoleTraceListener());
- Trace.Listeners.Add(new TextWriterTraceListener("trace.log"));
- //AJ: Increase available threads
- //AJ: https://docs.microsoft.com/en-us/azure/storage/storage-performance-checklist#subheading10
- //AJ: https://github.com/Azure/azure-documentdb-dotnet/blob/master/samples/documentdb-benchmark/Program.cs
- var minThreadPoolSize = 100;
- ThreadPool.SetMinThreads(minThreadPoolSize, minThreadPoolSize);
- //AJ: https://docs.microsoft.com/en-us/azure/cosmos-db/performance-tips
- //AJ: gcServer enabled in app.config
- //AJ: Prefer 32-bit disabled in project properties
- //AJ: DO IT
- var program = new Program();
- Trace.TraceInformation($"Starting @ {DateTime.UtcNow}");
- program.RunAsync().Wait();
- Trace.TraceInformation($"Finished @ {DateTime.UtcNow}");
- //AJ: Wait for user to exit
- Console.WriteLine();
- Console.WriteLine("Hit enter to exit...");
- Console.ReadLine();
- }
- public async Task RunAsync()
- {
- using (new CodeTimer())
- {
- var client = await this.GetDocumentClientAsync();
- var documentCollectionUri = UriFactory.CreateDocumentCollectionUri(ConfigurationManager.AppSettings["databaseID"], ConfigurationManager.AppSettings["collectionID"]);
- //AJ: Prepare Test Documents
- var documentCount = 10000; //AJ: 10,000
- var documentsForUpsert = this.GetDocuments(documentCount);
- await this.UpsertDocumentsAsync(client, documentCollectionUri, documentsForUpsert);
- var allDocuments = this.GetAllDocuments(client, documentCollectionUri);
- var area = this.GetArea();
- var documentsInArea = this.GetDocumentsInArea(client, documentCollectionUri, area);
- }
- }
- private async Task<DocumentClient> GetDocumentClientAsync()
- {
- using (new CodeTimer())
- {
- var serviceEndpointUri = new Uri(ConfigurationManager.AppSettings["serviceEndpoint"]);
- var authKey = ConfigurationManager.AppSettings["authKey"];
- var connectionPolicy = new ConnectionPolicy
- {
- ConnectionMode = ConnectionMode.Direct,
- ConnectionProtocol = Protocol.Tcp,
- RequestTimeout = new TimeSpan(1, 0, 0),
- RetryOptions = new RetryOptions
- {
- MaxRetryAttemptsOnThrottledRequests = 10,
- MaxRetryWaitTimeInSeconds = 60
- }
- };
- var client = new DocumentClient(serviceEndpointUri, authKey, connectionPolicy);
- await client.OpenAsync();
- return client;
- }
- }
- private List<TestDocument> GetDocuments(int count)
- {
- using (new CodeTimer())
- {
- return External.CreateDocuments(count);
- }
- }
- private async Task UpsertDocumentsAsync(DocumentClient client, Uri documentCollectionUri, List<TestDocument> documents)
- {
- using (new CodeTimer())
- {
- //TODO: AJ: Parallelise
- foreach (var document in documents)
- {
- await client.UpsertDocumentAsync(documentCollectionUri, document);
- }
- }
- }
- private List<TestDocument> GetAllDocuments(DocumentClient client, Uri documentCollectionUri)
- {
- using (new CodeTimer())
- {
- var query = client
- .CreateDocumentQuery<TestDocument>(documentCollectionUri, new FeedOptions()
- {
- MaxItemCount = 1000
- });
- var documents = query.ToList();
- return documents;
- }
- }
- private Polygon GetArea()
- {
- //AJ: Longitude,Latitude i.e. X/Y
- //AJ: Ring orientation matters
- return new Polygon(new List<LinearRing>()
- {
- new LinearRing(new List<Position>()
- {
- new Position(1.8567 ,51.3814),
- new Position(0.5329 ,51.4618),
- new Position(0.2477 ,51.2588),
- new Position(-0.5329 ,51.2579),
- new Position(-1.17 ,51.2173),
- new Position(-1.9062 ,51.1958),
- new Position(-2.5434 ,51.1614),
- new Position(-3.8672 ,51.139 ),
- new Position(-4.1578 ,50.9137),
- new Position(-4.5373 ,50.694 ),
- new Position(-5.1496 ,50.3282),
- new Position(-5.2212 ,49.9586),
- new Position(-3.7049 ,50.142 ),
- new Position(-2.1698 ,50.314 ),
- new Position(0.4669 ,50.6976),
- //AJ: Last point must be the same as first point
- new Position(1.8567 ,51.3814)
- })
- });
- }
- private List<TestDocument> GetDocumentsInArea(DocumentClient client, Uri documentCollectionUri, Polygon area)
- {
- using (new CodeTimer())
- {
- var query = client
- .CreateDocumentQuery<TestDocument>(documentCollectionUri, new FeedOptions()
- {
- MaxItemCount = 1000
- })
- .Where(document => document.Location.Intersects(area));
- var documents = query.ToList();
- return documents;
- }
- }
- }
- public class TestDocument : Resource
- {
- public string Name { get; set; }
- public Point Location { get; set; } //AJ: Longitude,Latitude i.e. X/Y
- public TestDocument()
- {
- this.Id = Guid.NewGuid().ToString("N");
- }
- }
- //AJ: This should be "good enough". The times being recorded are seconds or minutes.
- public class CodeTimer : IDisposable
- {
- private Action<TimeSpan> reportFunction;
- private Stopwatch stopwatch = new Stopwatch();
- public CodeTimer([CallerMemberName]string name = "")
- : this((ellapsed) =>
- {
- Trace.TraceInformation($"{name} took {ellapsed}, or {ellapsed.TotalMilliseconds} ms.");
- })
- { }
- public CodeTimer(Action<TimeSpan> report)
- {
- this.reportFunction = report;
- this.stopwatch.Start();
- }
- public void Dispose()
- {
- this.stopwatch.Stop();
- this.reportFunction(this.stopwatch.Elapsed);
- }
- }
- public class PropertyNameMapContractResolver : DefaultContractResolver
- {
- private Dictionary<string, string> propertyNameMap;
- public PropertyNameMapContractResolver(Dictionary<string, string> propertyNameMap)
- {
- this.propertyNameMap = propertyNameMap;
- }
- protected override string ResolvePropertyName(string propertyName)
- {
- if (this.propertyNameMap.TryGetValue(propertyName, out string resolvedName))
- return resolvedName;
- return base.ResolvePropertyName(propertyName);
- }
- }
- }
- [1]: https://docs.microsoft.com/en-us/azure/cosmos-db/geospatial#indexing
- [2]: https://docs.microsoft.com/en-us/azure/cosmos-db/request-units
- [3]: https://docs.microsoft.com/en-us/azure/cosmos-db/request-units-per-minute
- [4]: https://docs.microsoft.com/en-us/azure/cosmos-db/geospatial
- [5]: https://i.stack.imgur.com/fUYXf.png