Hi,
I have a cloud service WorkerRole that looks for a message on a queue and then runs a long-running process using that message's data. In the Azure portal I have set this up to scale by Queue (1 target per machine, scale up by 1 instance 20 minutes after last scale action, scale down by 1 instance 120 minutes after last scale action). The role starts with 1 instance running after I deploy it.
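For context, the worker's loop has roughly the following shape. This is only a simplified Python sketch with an in-memory queue standing in for the Azure queue, not the actual role code: `process`, `run_worker`, `poll_interval` and `max_idle_polls` are hypothetical names, and the idle-poll limit exists only so the sketch terminates (the real role keeps polling indefinitely).

```python
import queue
import time


def process(data):
    """Stand-in for the long-running job (hypothetical; the real work takes much longer)."""
    time.sleep(0.01)
    return f"done:{data}"


def run_worker(q, poll_interval=0.01, max_idle_polls=3):
    """Take one message at a time and process it fully before reading the next.

    Stops after `max_idle_polls` consecutive empty polls so the sketch terminates.
    """
    results = []
    idle_polls = 0
    while idle_polls < max_idle_polls:
        try:
            # In this sketch the message leaves the queue as soon as it is read.
            data = q.get_nowait()
        except queue.Empty:
            idle_polls += 1
            time.sleep(poll_interval)
            continue
        idle_polls = 0
        results.append(process(data))
    return results


if __name__ == "__main__":
    q = queue.Queue()
    for message in ("msg1", "msg2", "msg3"):
        q.put(message)
    print(run_worker(q))
```

With a single worker, each message is read only after the previous one has finished processing, which matches what I see when running one instance without scaling.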
I ran a test where I placed 3 messages in the queue. With the scaling I mentioned above, here is what happens:
1) The currently running instance (#0) removes the first message and begins processing it.
2) A short while later (5 minutes or so), the portal says that the instances are TRANSITIONING with 1 instance running and 1 starting (which eventually starts running so that 2 instances are running).
3) The 2nd instance appears to begin processing the 2nd message and is then stopped, after less than a minute of processing. The 2nd instance then reads the 3rd message and processes it completely.
Any idea why the 2nd instance would apparently restart like this so that the 2nd message never gets completely processed?
If I turn off scaling and run this test with 2 instances, all 3 messages are eventually read and processed without problems. If I run this test with just one instance (no scaling), all 3 messages are read (each one after the previous one has finished processing) and processed correctly.
Thanks!