jk3264

Express Feature failure mode

Hi.

The relatively new "Express" feature -- starting with some typical speed and then working up or down -- is a good idea operationally and statistically, but only if the tests on which the estimate is based are quite stable. In a situation (like the one I'm trying to debug) in which speeds are fluctuating wildly (e.g., circa 80% of download readings in the 9-12 Mbps range but the other 20% clustering around 15-20 kbps), working from an average is a disaster. I'm seeing automatic tests that start with 7 MB downloads simply stall, sitting for several hours at less than 50%.
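
To put rough numbers on why the average misleads here, a small sketch follows. The readings are made up to match the ranges described above (they are not measured data), and `download_seconds` is a hypothetical helper, not anything from the site's code.

```python
from statistics import mean

def download_seconds(size_bytes: int, rate_bps: float) -> float:
    """Seconds needed to move size_bytes at rate_bps (bits per second)."""
    return size_bytes * 8 / rate_bps

# Hypothetical history: 80% of readings near 10 Mbps, 20% near 17 kbps.
readings_bps = [10_000_000] * 8 + [17_000] * 2

avg_bps = mean(readings_bps)      # dominated by the fast readings
size_bytes = 7 * 1_000_000        # a 7 MB test, taking 1 MB = 10**6 bytes

fast_s = download_seconds(size_bytes, 10_000_000)  # a few seconds
slow_s = download_seconds(size_bytes, 17_000)      # the better part of an hour

print(f"average rate: {avg_bps / 1e6:.2f} Mbps")
print(f"7 MB at 10 Mbps: {fast_s:.1f} s")
print(f"7 MB at 17 kbps: {slow_s / 60:.0f} min")
```

The average still lands around 8 Mbps, so a test size chosen from it looks reasonable, yet one run in five would need close to an hour even if the slow rate held steady, and longer if it dropped further.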

If there were an effective and very quick downgrade procedure, that might still be ok, but whatever downgrade procedure is built in seems (from observation) to be based, not on a timer, but on having the initial data download complete.
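
For illustration only, a timer-based downgrade could look something like the sketch below. I have no knowledge of how the site actually implements Express; `run_with_downgrade`, `simulated_transfer`, the size ladder and the 15-second budget are all assumptions made up for this example.

```python
from __future__ import annotations
from typing import Callable, Optional

# transfer(size_bytes, budget_s) -> elapsed seconds, or None if the deadline hit first
Transfer = Callable[[int, float], Optional[float]]

def run_with_downgrade(
    sizes_bytes: list[int], budget_s: float, transfer: Transfer
) -> Optional[tuple[int, float]]:
    """Try sizes from largest to smallest, abandoning each one after budget_s."""
    for size in sorted(sizes_bytes, reverse=True):
        elapsed = transfer(size, budget_s)
        if elapsed is not None:
            return size, elapsed   # finished inside the time budget
    return None                    # even the smallest size timed out

def simulated_transfer(rate_bps: float) -> Transfer:
    """Hypothetical stand-in for a real download at a constant rate."""
    def transfer(size_bytes: int, budget_s: float) -> Optional[float]:
        needed = size_bytes * 8 / rate_bps
        return needed if needed <= budget_s else None
    return transfer

ladder = [7_000_000, 1_000_000, 100_000, 25_000]   # assumed test sizes
slow_result = run_with_downgrade(ladder, 15.0, simulated_transfer(17_000))
fast_result = run_with_downgrade(ladder, 15.0, simulated_transfer(10_000_000))
print(slow_result, fast_result)
```

The point of the design is that the decision to step down depends on the clock and not on the first download finishing, so a stalled 7 MB transfer costs at most one time budget before a smaller size is tried.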

To save comments about the connection itself (more on that when I figure out what is going on), I've had packet analysers on both the LAN and between the local router and the cable modem, and they are seeing fairly consistent local traffic density except when tests are being run. Running your tests from multiple computers on the LAN and using different testing procedures yield roughly consistent results (the numbers may be different, but the wide fluctuations, and when they occur, are consistent).

IMO, there needs to be a way to easily disable the "Express" mechanism. Even if you fix the downgrade procedure, there will probably always be edge cases extreme enough for it to fail.

Unless you have a dedicated line (T1-T3, etc.), I do not see the relevance of testing two machines on the same LAN simultaneously with the Express test and getting your expected results.

In your situation, unless I'm not understanding you, I would test that way with the specific size test, and make sure you're not using a VPN or remotely controlling the second machine, as that will throw off the test within your LAN, not externally.

Also, looking at NetBIOS: if you're running a Windows machine, doing what you're doing will throw internal results, not external.

I see what you're talking about. I had a feeling that I was going to have to go back and rethink that... it's a good idea but I need to build a failsafe to make sure that doesn't happen, actually... just thinking of the word 'failsafe' gave me an idea. ... and I'm pretty sure it's exactly the fix you're thinking about. (damn, I think I have a solution before I've even dug into the code, lol)

For now, I've disabled this feature. Once I've worked on the idea a little more and re-enabled it, I'll make sure that there's a checkbox that gives you the option to enable and disable this timesaver.

Thank you for your input, it's very valuable to me.

- CA3LE

Okay, so I added options for that. The other idea I had is going to be more work than I'm willing to put in (on that) today.

There is now an Express Combined link in the menu and the Auto test page now has a checkbox that will make it easy to turn that feature on and off... because of your comments I've made this feature disabled by default. It's a cool feature but it may not be ready for default for a while... I have to teach it some more tricks first, lol.

Thanks for your suggestion and bug report. As you can see by how quickly I took care of this, I don't mess around. :twisted:

Thanks very much. And "not messing around" is greatly appreciated. For amusement (perhaps fixed now as well, but the line speed went back up), the initial "express" behavior produced an even more interesting anomaly: once the speed averages came down into the kilobit range, one could click on "automatic" or "1 Mb" and get a message about it not being possible to perform tests outside a predefined range. The user thought something outside that range was being selected, but the system thought otherwise :-(

Unless you have a dedicated line (T1-T3, etc.), I do not see the relevance of testing two machines on the same LAN simultaneously with the Express test and getting your expected results.

The exchanges with CA3LE focused on the problem I was concerned about, but just to clarify...

I didn't say "simultaneously". The procedure I was using was to try to run an automatic (repeated) test on one machine that wasn't doing anything else. When I saw a test completed with a very slow download rate (note that we are talking about a couple of orders of magnitude here), I manually reran the test from another machine (different loading, different hardware, different configuration, even in some cases different operating system) as an informal means of verifying that the lousy results were not simply an artifact of a transient on the test machine.

In your situation, unless I'm not understanding you, I would test that way with the specific size test, and make sure you're not using a VPN or remotely controlling the second machine, as that will throw off the test within your LAN, not externally.

The problem I was seeing (see the rest of the thread) was that the "express" feature was overriding even the specific size tests. Ask for a megabyte; get a complaint about not being able to run tests below 97 (?) KBps.

Also, looking at NetBIOS: if you're running a Windows machine, doing what you're doing will throw internal results, not external.

Not sure I understand this. When I said "packet analyzer" I meant a dedicated device that was actually sniffing packets on the LAN. And the only reason for doing that (and for making measurements on the router) was to be sure that some odd device on the LAN wasn't creating a packet flood and swamping the connection. Given that the internal LAN is running with gig Ethernet switches, it is not obvious what netbios on a client machine could possibly tell me that would be relevant. When I see an odd pattern in a WAN measurement, my first instinct is to try to verify that it isn't actually being caused by something going on with local machines (hence repeating the test on a second machine) or otherwise on the LAN.

What I meant by the NetBIOS comment was a suggestion to kill it. I was assuming you were running them statically and taking tests simultaneously to locate the anomaly in a QOS situation.

Glad you have it straightened out.

@mudmanc4: Ah, that is clear now. Thanks. Netbios has been off for a rather long time -- years at least. And it wouldn't have even occurred to me to try to use this sort of tool to sort out QOS problems (perhaps just lack of imagination :-)). But now I at least understand your comment. Much appreciated.

That was corrected shortly after (or while) you posted. ;)

... and, you're welcome sir! :)
