About Me

I am the "IBM Collaboration & Productivity Advisor" for IBM Asia Pacific. I'm based in Singapore.

Myth Buster: NSF doesn't scale

In a lot of customer discussions I hear: "Oh, we need to go RDBMS since Notes' NSF doesn't scale." That is an interesting statement. When I dig deeper, I get a number of reasons why they believe it:
  • Somebody told them
  • They had a problem in R4
  • The application got slower and slower over the years (and yes, it's that same old server)
  • The workflow application using reader fields is so slow
  • They actually don't know
Then I show them this:
[Screenshot: more than 1.6 million documents in an NSF]
(actually not the graphic, but the live property box in my Notes client). The question quickly arises how acceptable performance can be achieved for such a database. There are a few pointers to observe:
  • Watch your disk fragmentation (troubleshooting tip on the Notes and Domino wiki)
  • Be clear about your reader and author field usage. In case the RDBMS fans insist on their solution, ask them to build the reader field equivalent in an RDBMS and then measure its performance.
  • Watch your view selection formulas carefully. (You don't use @Now, @Today, @Yesterday or @Tomorrow do you?)
  • You want to use DAOS to keep attachments out of the NSF (helps with fragmentation) -- don't forget to buy a disk for your transaction log.
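
To make the reader field challenge concrete, here is a minimal sketch of what a reader field equivalent could look like in an RDBMS. It uses Python with an in-memory SQLite database; the table names (`documents`, `doc_readers`), the sample data and the `visible_documents` helper are assumptions made up for illustration, not anything Domino or a particular RDBMS ships. It also simplifies: author fields and nested group resolution are left out.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a hypothetical access table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE documents (
            id      INTEGER PRIMARY KEY,
            subject TEXT NOT NULL
        );
        -- One row per (document, reader entry); a document without rows
        -- is treated as unrestricted, like a document with no reader field.
        CREATE TABLE doc_readers (
            doc_id INTEGER NOT NULL REFERENCES documents(id),
            reader TEXT NOT NULL
        );
        CREATE INDEX idx_doc_readers ON doc_readers (reader, doc_id);
        """
    )
    conn.executemany(
        "INSERT INTO documents (id, subject) VALUES (?, ?)",
        [(1, "Budget"), (2, "Roadmap"), (3, "Lunch menu")],
    )
    conn.executemany(
        "INSERT INTO doc_readers (doc_id, reader) VALUES (?, ?)",
        [(1, "[Finance]"), (2, "alice"), (2, "[Managers]")],
    )
    return conn


def visible_documents(conn, names):
    """Return (id, subject) rows the user may read.

    `names` is the user's name plus any groups and roles (assumed to be
    resolved already and to be non-empty).
    """
    placeholders = ",".join("?" for _ in names)
    sql = f"""
        SELECT d.id, d.subject
        FROM documents d
        WHERE NOT EXISTS (SELECT 1 FROM doc_readers r WHERE r.doc_id = d.id)
           OR EXISTS (SELECT 1 FROM doc_readers r
                      WHERE r.doc_id = d.id AND r.reader IN ({placeholders}))
        ORDER BY d.id
    """
    return conn.execute(sql, list(names)).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(visible_documents(db, ["alice"]))
    print(visible_documents(db, ["bob", "[Finance]"]))
```

Every read now carries an extra subquery per document against the access table, and every view or report has to apply the same filter. That is the part worth measuring on a realistic data volume before comparing it with NSF.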
As usual YMMV.

Comments

1 - Pish posh. Let me know when you break 2 million documents. Then we'll be in the same ballpark. At 5 million, I'll get excited.

Of course, it helps to know how to do more with just one index, instead of the 150 that most Notes apps tend to have.

2 - We have Notes databases of a similar size; however, we find that in the case of a view corruption it can take over 6 hours to rebuild one view. While it doesn't have any date formulas, it does have multiple sortable columns, which obviously increases the size of the index (and therefore the time taken to rebuild).

Because of this, we are in the middle of reviewing the design - not moving away from Notes but rather looking at the data and working out ways to either archive the data or provide different ways to access the specific data they want.

While others would look at our architecture and say "Notes doesn't work - let's go to a RDBMS", we're saying: let's look at the architecture and the data and work out how we can improve and optimise it.

I'm certainly not anti-RDBMS, and a portion of our new application will be Ruby on Rails/MySQL, but with any system it's worth looking into the data, how and why it's used, and determining the best way of getting access to it.

Just my two cents worth...

3 - Our largest is only 1.5 million but it takes up 15.2 GB. The view indexes (indices?) are over 10 GB of that space (although the view index dialog only says 2 GB - not sure what's up with that) - check out { Link }

This is perhaps what Nathan was talking about - multiple views doing the same or similar thing. Something we'll be looking at in the design review.

4 - Currently at 2.4 million, but we have to be very careful with mass changes on data because the server crawls when the indexer runs.

5 - NSF scales fine. There are three pieces that don't:

1. Readers' fields (you've covered this)
2. Complex or time-based views
3. Any self-written pseudo-indexing code (e.g. JOINs, or if you wrote code to allow your users to sort a view by category totals.) If your indexing options can't be specified in a view you're stuck writing script/java/ssjs code to do really heavy crunching of data sets.

I've got several DBs with anywhere from 3mil to 10mil+. The biggest have views set to "manual" and are fairly static, and those fly.

But I've also got DBs with a mere 50,000 docs that fall under #3 and they're slow in comparison.

6 - We let an archive accidentally build to 4.3 million records.

Granted, this database is not used for transactions, it gets data dumped in weekly... but it runs fine.

Disclaimer

This site is in no way affiliated with, endorsed, sanctioned, supported, nor enlightened by Lotus Software nor IBM Corporation. I may be an employee, but the opinions, theories, facts, etc. presented here are my own and are in no way given in any official capacity. In short, these are my words and this is my site, not IBM's - and don't even begin to think otherwise. (Disclaimer shamelessly plugged from Rocky Oliver)
© 2003 - 2013 Stephan H. Wissel - some rights reserved as listed here: Creative Commons License
Unless otherwise labeled by its originating author, the content found on this site is made available under the terms of an Attribution/NonCommercial/ShareAlike Creative Commons License, with the exception that no rights are granted -- since they are not mine to grant -- in any logo, graphic design, trademarks or trade names of any type. Code samples and code downloads on this site are, unless otherwise labeled, made available under an Apache 2.0 license. Other license models are available on written request and written confirmation.