Hammering Crowd

A few customers have been wondering how Crowd scales (outside of its integration with JIRA/Confluence). Unfortunately, the answers we could think of ranged from "..yes" to "nfi" - so we decided to take a look at load testing Crowd.

Since Crowd offers a bunch of connection points for various applications, directories and databases, it's hard to give an accurate single metric for scalability. One particular evaluator was asking how Crowd would scale for 1 million users using an internal directory (MySQL) with a PHP application.

It's a massive number given that we consider 20,000 users a large user base.

Getting a million users

We ran a script to insert users into our internal directory, starting with user0 and ending with user999999, which took 5 hours.

I came back in this morning and found that Dave, our team lead, had already verified that it was possible to authenticate and use the Crowd console without issues. This shows us that Crowd is capable of not falling over if there are 1 million users in its repository. He also decided to double-check that JIRA integration was very broken with this many users :)

Load Testing

The client language (PHP, Java, .NET or whatever) is unlikely to make a huge difference if your web-services stack is slick. What matters more is which calls your application makes to Crowd. You can bet that findAllPrincipalNames will take much longer than findPrincipalByName, for example.

As the evaluator didn't have a concrete idea of the number and nature of the calls his application would be making, we decided to test out the fundamental calls made to Crowd:


  1. Authenticate

  2. Validate token

  3. Find user from token

When you go to an application, you're likely to log in (authenticate) once, but you're likely to perform many secure operations (each requiring a valid token and the corresponding user) once logged on. Thus we roughly approximate 100 token checks per authentication call for our load test. In reality you're likely to need fewer authentications.

We had two ideas for load testing: hammer the crap out of it or approximate a reasonable load. We decided to go with the hammering option as it's possible to extrapolate performance under a reasonable load from that data - and it's much faster to simulate than mimicking the 3-30 seconds a user would take to read a web page before clicking.

The Hammering

Hammering Crowd means seeing how many concurrent threads Crowd can service. So we take n users and launch n threads. Each thread performs:

for 1..100
{
  authenticatePrincipal()
  for 1..100
  {
    verifyToken()
    findPrincipalByToken()
  }
}

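Which is ~10,000 token checks per thread (each check being a verifyToken call plus a findPrincipalByToken call, so roughly 20,000 individual calls to Crowd). Note that this test is equivalent to something logging in and pressing refresh 100 times (really fast!) and repeating that 100 times.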
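
The per-thread loop above can be sketched as runnable JavaScript. This is only an illustration, not the script we actually ran: makeStubClient and hammerOnce are hypothetical names, and the stub client merely counts calls instead of talking to a Crowd server. It covers the work of a single thread; launching n of them concurrently is left out.

```javascript
// Hypothetical stand-in for a Crowd client. The method names mirror the
// pseudocode above; the bodies are stubs that only count how often they run.
function makeStubClient() {
  const counts = { authenticatePrincipal: 0, verifyToken: 0, findPrincipalByToken: 0 };
  return {
    counts,
    authenticatePrincipal(name) {
      counts.authenticatePrincipal += 1;
      return "token-for-" + name; // fake token, not a real Crowd token
    },
    verifyToken(token) {
      counts.verifyToken += 1;
      return token.startsWith("token-for-");
    },
    findPrincipalByToken(token) {
      counts.findPrincipalByToken += 1;
      return token.slice("token-for-".length);
    },
  };
}

// The work done by ONE test thread: log in, then do many token checks per login.
function hammerOnce(client, userName, logins = 100, checksPerLogin = 100) {
  for (let i = 0; i < logins; i++) {
    const token = client.authenticatePrincipal(userName);
    for (let j = 0; j < checksPerLogin; j++) {
      if (!client.verifyToken(token)) {
        throw new Error("token rejected for " + userName);
      }
      client.findPrincipalByToken(token);
    }
  }
  return client.counts;
}
```

Calling hammerOnce(makeStubClient(), "user0") with the defaults records 100 authentications and 10,000 calls each to verifyToken and findPrincipalByToken, which makes it easy to see how the 100:1 ratio of token checks to logins adds up.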

Clearly with a handful of threads, you'd expect Crowd to get smashed.

For the load test, Crowd and MySQL were on the same box: a 4-core Mac Pro, networked over a 100 Mbit line to the client, which resided on a separate box. Check out the results:

crowd-hammer-table.png

crowd-hammer-graph.png


Analysis

Making sense of the data:


  • Authentications/authentication verifications are pretty fast (~10ms).

  • Crowd performs optimally when there are 4-6 threads hammering it at the same time and doesn't appear to show signs of death for more concurrent threads.

  • The JVM heap was 128MB and it didn't die, i.e. Crowd is not hogging memory since there's only a handful of entities it needs to load up for this authentication test.

  • Load could be limited by the generation box. However, the generation box was an 8-core beast whereas the Crowd server was on a 4-core box.

  • Crowd seems to scale for 15+ concurrent threads hammering it with authentication requests. Overnight, we ran a 50-concurrent-threads test which had an average request service time of 8.26ms. Equivalently, this translates to about 120 requests serviced per second.

  • At 50 concurrent threads, we are still not maxing out the CPUs although their idle time is decreasing. We could push Crowd even further until it was either CPU, disk or network bound.

  • We can extrapolate these results by pessimistically assuming that "reasonable usage" means one authentication check per active user every 10 seconds. At 120 requests per second, that works out to 120 × 10 = 1200 active users. Note that 10 seconds overestimates the load, especially if client libraries cache authentication for much longer (usually around 2 minutes).

  • This is still a basic test and we should investigate broader performance testing of Crowd's API.
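
The extrapolation in the analysis can be written out as a few lines of arithmetic. This is a back-of-the-envelope sketch using the figures quoted above; the function names are made up for illustration, and it assumes every active user makes exactly one request per gap.

```javascript
// Figures taken from the analysis above.
const avgServiceTimeMs = 8.26;   // average time per request in the 50-thread run
const secondsBetweenChecks = 10; // assumed gap between one user's checks

// Requests per second implied by the average service time.
function requestsPerSecond(serviceTimeMs) {
  return 1000 / serviceTimeMs;
}

// Active users supported if each user makes one request every `gapSeconds`.
function activeUsers(reqPerSec, gapSeconds) {
  return Math.floor(reqPerSec * gapSeconds);
}

const rps = requestsPerSecond(avgServiceTimeMs);
console.log(rps.toFixed(1));                          // 121.1
console.log(activeUsers(rps, secondsBetweenChecks));  // 1210
console.log(activeUsers(120, secondsBetweenChecks));  // 1200 (using the rounded 120 req/s)
```

Stretching the gap to the 2 minutes a caching client library might allow (activeUsers(120, 120)) pushes the same estimate to 14,400 users, which is why the 10-second figure is on the pessimistic side.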

Going Forward

Load testing is important. It's even more important when you're middleware. Although these metrics are a start, we should consider further performance testing:


  • Various directories: OpenLDAP, ActiveDirectory.

  • Various databases: Postgres, Oracle.

  • Load testing real applications integrated with Crowd (e.g. Confluence, JIRA): might be good to compare how much fat Crowd adds to an application's standard repository.

  • Replaying logs from customers' (or our own) applications for automated load testing of Crowd (without needing to run a specific client application).

  • Profiling: determine which methods are letting us down and optimise based on potential benefit.

There are two sides to considering Crowd and load, and the first is to ensure the Crowd Server is lean and mean. The side which we didn't examine in this post, particularly useful for Java clients, is to ensure that our client libraries are smart and efficient - which boils down to effective caching - only making requests to the Crowd server when actually necessary.


We're working on making Crowd, and Crowd integration, even slicker in 1.4!


How Technical Writers Use Confluence and DITA XML

dynamic_toc_sequoyah.png

Anne Gentle writes about a presentation by Lisa Dyer of Lombardi Software, given at the February Central Texas DITA User Group meeting, on a customer and documentation wiki sourced from DITA topics. Lisa's presentation explores how the company is using DITA and a wiki as the framework for collaborative information development, both internally and with customers who have a support login.

First, what's DITA?

It stands for Darwin Information Typing Architecture, and, "is an XML-based architecture for authoring, producing, and delivering technical information." (Source: Wikipedia)

DITA was originally developed by IBM and encourages content to be created and organized by topic, then published in multiple formats such as HTML, PDF, and the online help formats used by Java-based projects and by companies like Microsoft and Oracle.

Lisa's Presentation

On the importance of starting with internal use only at first:

Lisa Dyer recommends a pilot wiki, internal only at first, to ferret problems out while building in time to fix the problems. Michele Guthrie from Cisco... also has found that internal-only wikis helped them understand the best practices for wiki documentation.

That way, when you open the wiki to the external community, the internal contributors from your organization are well-versed in wiki use and ready to help nurture the external community's growth. The last thing you want is the internal people just getting used to the wiki at the same time as the external folk.

On choosing an open source or enterprise wiki:

Lisa said to ask questions while evaluating, such as where do you want the intellectual property to develop? Will you pay for support? Who are your key resources internally, and do you need to supplement resources with external help?

They found it faster to get up and running and supported with an enterprise engine and chose Confluence, but she also noted that you "vote" for updates and enhancements with dollars rather than, say, community influence. (Editorial note - I'm opining on whether you get updates to open source wiki engines through community influence.)

Anne's editorial note makes a good point. For Confluence, we have a public JIRA instance where anyone can raise an issue and others can vote on it - which helps the development team get direct feedback on what customers need. This is a good example of combining voting with dollars (when you buy a new software license or continuing maintenance) and community influence.

Lisa's talk also looks at topics like getting DITA to talk to the wiki, creating a wiki table of contents from a DITA map, and some of the limitations and considerations to think about when starting a project like this.


How do you use a wiki? Poll results

On Monday, I posted a reader poll over on Grow Your Wiki asking how people use wikis in organizations. As of last night, 127 people responded, and here's what they had to say:

poll-results-wiki-use.jpg

There were three respondents who chose "Other", and here are their specific responses: Managing classroom information, garbage trash, and audits. Now, I can't really say much about garbage trash, but I can comment on the other two "other" uses:

Managing classroom information is an excellent wiki use. In fact, I got started using wikis doing something very similar - building a wiki-based science curriculum.

Using a wiki for audits is a great use too - besides having all your information easily accessible in one place, the revision history the wiki maintains for every page is very audit-friendly since it shows a complete trail of who contributed information, when they did so, and what was added, changed and removed.

Jeffrey Keefer (Twitter) commented on the post and asked about a poll for education uses. That's coming next week! He also asked for more information on some of the uses I included in the poll, like project management. Watch for that next week too.


JQuery, a beach and the Confluence Team

Last Friday the Sydney-based Confluence team headed out to world famous IT-suburb Manly for a one-day workshop to learn more about advanced JavaScript and JQuery, the JavaScript-library that Confluence will standardize upon. Obviously it was pure coincidence that the weather was fine, the coffee great and the beers cold, and we were positively surprised to find out that a nice beach was just across the road...

Upon arrival in Manly, Dmitry, our revered JavaScript guru and general CSS & typography fanatic, introduced us to the wonderful world of Script-Craft, which included closures, prototypes and loose typing. Being hardcore backend Java developers didn't exactly give us a big advantage here - our heads were spinning, but we did pretty well on the exercises. Or so we think. Would you know how to:

Add to any number the function "times" which will run some function N times.
(5).times(function () {alert(this);}); 
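
One possible answer to that exercise, offered as a sketch rather than the workshop's official solution: hang a times method off Number.prototype and call the callback with the number as this. Passing the loop index as an argument is an extra assumption, not part of the task, and extending built-in prototypes is generally frowned upon outside of exercises like this one.

```javascript
// Add a "times" method to every number: run `fn` that many times.
// `fn` is called with `this` set to the number, and with the current
// iteration index as its first argument (the index is an added convenience).
Number.prototype.times = function (fn) {
  // Guard against negatives and fractions; NaN results in zero iterations.
  const n = Math.max(0, Math.floor(Number(this)));
  for (let i = 0; i < n; i++) {
    fn.call(this, i);
  }
};

// Usage: logs "5" five times (console.log stands in for alert outside a browser).
(5).times(function () {
  console.log(String(this));
});
```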

Click here for the full set of tasks [PDF]

The original plan was to squeeze in 6.5 hours of presentations and exercises on JQuery, but to be honest, we only got to 5 hours. Let me explain: I blame the weather! Huge clouds were beginning to show on the horizon, so we ditched our plans (agile!) and hit the beach right after lunch. When we returned to lessons our energy levels were quite low, so after Matt's presentation of JQuery and another hour of exercises we called it a day and headed straight for a cold beer in a nearby bar.

Photos!

confluence-manly-jetcat.jpg

In the morning, everyone was still looking forward to a nice day on the beach. Little did they know of what was ahead...

confluence-manly-dmitry.jpg
... like Dmitry talking about Loose Typing in JavaScript. Yuck! :-)

confluence-manly-concentrate.jpg

Never has the Confluence team worked as focussed as this. From left to right: Andrew, Agnes, Anatoli, Adnan (yeah, we try to hire mainly A-players), Don, Dave and Chris.

confluence-manly-matt.jpg
Matt talks on JQuery. The crowd pretends to understand it.

confluence-manly-esplanade.jpg
Matthew and Adnan walking down the esplanade, probably discussing the intricacies of product management.

beach.jpg
Parts of the Confluence team returning from the surf. From left to right: Don, Andrew, Dave, Matthew and Anatoli.

confluence-manly-david-per.jpg


David got the shortest straw and had to pair with Per. Per does what he is best at, and delegates all the hard work back to David.

confluence-manly-chris-anatoli.jpg


Chris and Anatoli working on JQuery tasks

Next steps

I heard plenty of "when will we do this again?" We will aim at doing the next workshop in three months. I would suggest other teams try something similar as well. The planning overhead is smaller than it seems, especially if you have regular internal training every week anyway, and in the end we spent around $40 per person for the whole day — not much more than what we spend on our regular infamous post-release lunches.


4 challenges to wiki adoption in organizations: #1 high-level resistance

Sandy Kemsley writes about 4 challenges to social media/enterprise 2.0 adoption that organizations face. The first is resistance at the high level:

...higher-level people are more resistant to bringing in Enterprise 2.0 technologies because it represents a democratization of content and a relative loss of power at their level.

What they have to realize is that people below them will bring these tools in, even if they have to do so under the radar. She continues:

...it explains a lot about why I...have so completely embraced social applications, and actively push their use in my customers' organizations: as an independent consultant/analyst, I have no corporate hierarchy and therefore see the value without a filter of fear.

That's powerful, and it shows why organizations with 50,000 or 200,000 people need to start thinking like independent consultants when it comes to sharing information.


JIRA has Sixth Sense (Analytics)

How much time do you spend in JIRA, coding, writing requirements, or... um, blogging? Most time tracking tools only give you partial insight into some of these figures, but if you're really wanting a precise figure, you might turn to Sixth Sense Analytics. They have recently built an integration to JIRA.

Redmonk produced a video showing how the Sixth Sense Analytics integration works with JIRA. The video file used by the embedded player is http://blip.tv/file/get/Redmonk-6thSenseAndJIRA392.flv.


Congratulations to GigaSpaces

GigaSpaces Technologies today announced that its Wiki Documentation Portal has won a prestigious professional accolade, the Award of Excellence in the Society for Technical Communication (STC) California and Silicon Valley Technical Communication Competition.

The wiki in question is Confluence. We wrote a case study about their use of the wiki to write documentation many months ago. A hearty congratulations to GigaSpaces!

More in-depth information about using wikis for publishing online help and documentation can be found on a blog published by one of Atlassian's tech writers, Sarah Maddox.
