SharePoint 2007: Troubleshooting ‘Unexpected Error’ while editing web parts

One of the great things about SharePoint is the ability for non-technical users to update page content without the assistance of a development team. On the other hand, one of the terrible things about SharePoint is the ability for non-technical users to update page content without the assistance of a development team.

I recently had a client that was experiencing difficulties when attempting to edit web parts on certain pages in their Intranet site, running on SharePoint 2007. For starters, clicking on the web part’s Edit menu would make the browser window scroll back to the top of the page, rather than opening the Edit menu as it should. No big deal. Who needs to edit a web part, anyway?

If that wasn’t bad enough, any time they would drag and drop a web part to a different web part zone on the page, they would receive the wonderful “An unexpected error has occurred” error, preventing them from saving the changes to the page.

To add some madness to the mayhem, the issues only occurred in Internet Explorer. Since IE is the only browser officially supported by SharePoint 2007, SharePoint uses a dumbed-down page editing interface in other browsers. While much less user-friendly, the simplified interface allowed my client to successfully edit and move web parts without encountering errors.

Troubleshooting the problem

As my first step in troubleshooting the problem, I edited the web.config file, turning off Custom Errors and enabling the stack trace output. The hope was that this information would point me in the right direction to solving the problem. While the new error message was much less cryptic than “unexpected error”, it was rather unexpected and perplexing at first.
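For reference, the two switches usually involved are the mode attribute on the customErrors element and the CallStack attribute on SharePoint's SafeMode element. The sketch below is a small Python illustration, run against a made-up web.config fragment, of what flipping those two attributes looks like. The element and attribute names reflect a typical SharePoint 2007 web.config as I understand it, so check them against your own file and back it up before changing anything.

```python
import xml.etree.ElementTree as ET


def enable_detailed_errors(web_config_xml):
    """Return web.config XML with custom errors off and call stacks on."""
    root = ET.fromstring(web_config_xml)
    # <customErrors mode="Off" /> makes ASP.NET show the real exception.
    for custom_errors in root.iter("customErrors"):
        custom_errors.set("mode", "Off")
    # <SafeMode CallStack="true" /> asks SharePoint to include the stack trace.
    for safe_mode in root.iter("SafeMode"):
        safe_mode.set("CallStack", "true")
    return ET.tostring(root, encoding="unicode")


# Made-up fragment for illustration; a real web.config is much larger.
SAMPLE = """<configuration>
  <SharePoint>
    <SafeMode MaxControls="200" CallStack="false" />
  </SharePoint>
  <system.web>
    <customErrors mode="On" />
  </system.web>
</configuration>"""

patched = enable_detailed_errors(SAMPLE)
print(patched)
```

Remember to switch both settings back once you are done, since detailed errors are not something you want exposed to end users.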

Guid Error

“GUID should contain 32 digits with 4 dashes.” Gee, thanks! I know what a properly formatted GUID should look like, but a GUID is never used as user input on the page. Why is it passing in an invalid GUID to the SPWebPartManager? Shouldn’t SharePoint be able to keep track of these GUIDs on its own? This got me to thinking that something must be preventing SharePoint from determining the GUID of one of the web parts, thus causing the error. But why?

The page contained a Rich Content area at the top with user-defined content in it, as well as a couple of Content Editor web parts, so I decided to investigate. And sure enough, I found the culprit.

When the user pasted in the HTML content for the Rich Content area, it must have been copied from a Content Editor web part at some point. Unfortunately, not only was the inner content of the web part pasted into the content area, but the outer wrapping DIV was pasted as well, including the ID of the previous web part. This was creating a conflict of web part IDs, causing the JavaScript function to fail when determining the GUIDs of the web parts on the page.
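If you suspect the same thing on one of your own pages, one way to check is to save the rendered HTML and look for element IDs that appear more than once. The following is a minimal Python sketch of that idea, not the method I used at the time. The sample markup and the ID values in it are invented for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects the value of every id attribute found in the markup."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def duplicate_ids(html):
    """Return a sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, count in counts.items() if count > 1)


# Invented page: the first WebPartWPQ2 div was pasted into the content area.
SAMPLE_PAGE = """
<div id="RichContent">
  <div id="WebPartWPQ2"><p>Pasted announcement text</p></div>
</div>
<div id="WebPartWPQ2"><p>Real Content Editor web part</p></div>
<div id="WebPartWPQ3"><p>Another web part</p></div>
"""

duplicates = duplicate_ids(SAMPLE_PAGE)
print(duplicates)
```

Any ID reported here is worth a closer look, since two elements sharing an ID is exactly the kind of conflict that tripped up the page's JavaScript.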

After removing the extra wrapping DIV and the ID from the content area and saving the page, the web parts began to function properly. If I took one lesson away from this scenario, it’s that users aren’t always careful about their input, and SharePoint is not good at offering help to the non-technical user. As with any Content Management System, SharePoint is prone to user error, one of the most difficult types of errors to troubleshoot.

SharePoint: Successfully Debugging Custom Timer Jobs

Timer Job
You can even write a timer job to send you an email when your eggs are done.

I’ve developed and debugged my share of custom SharePoint Timer Jobs in my day, yet there is still a rather tricky ‘gotcha’ that seems to trip me up every time, even though I am well aware of it. Perhaps writing this article about it will help me remember next time?

My debugging style tends to be something like this: Change a line of code, run the change in debug mode. Change a line of code, run it in debug mode. You get the picture. While not always the most efficient way, it’s a fairly reasonable way to get the job done. Once my change allows the debugger to get past the error that was being thrown, I know I can move on to the next part of the logic.

However, when you’re dealing with SharePoint, no process is this simple. SharePoint likes to cache lots of stuff, sometimes causing it to behave in a somewhat unexpected manner. The same goes for DLLs. In order to increase system performance, SharePoint’s timer service caches all of the assemblies it needs to run all defined timer jobs, including any custom-developed jobs.

Services
The Services window on Windows Server 2008 R2 Standard

So, what does this mean?

Even if you change your code and redeploy the solution to SharePoint, the Timer Job will execute using the old, cached DLL, rather than the new DLL from the GAC. The Visual Studio debugger will be unaware that it is not using the new DLL, which can lead to some interesting results. If the debugger seems to be acting like a drunken mental patient, this is a huge clue that SharePoint is running the old DLL.

Remember this:

EVERY time you make a single change to your timer job and wish to debug, you must restart the Windows SharePoint Services Timer service in SharePoint 2007, and the SharePoint 2010 Timer service in SharePoint 2010. This can be done by navigating to Start > Administrative Tools > Services, right-clicking on the service, and selecting Restart. The next time the job runs, it will use the new version of the DLL that you deployed.

To avoid this problem, only one additional step is necessary in my debugging process. It’s simple and easy, but it manages to get me every time.
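Since I keep forgetting this step, it can help to script it. Below is a small Python sketch that builds the net stop and net start commands for the timer service. The service names SPTimerV3 (2007) and SPTimerV4 (2010) are what I believe the services are registered as, so confirm them in the Services window first. The sketch only prints the commands unless dry_run is set to False, and actually restarting the service requires an elevated prompt on the SharePoint server.

```python
import subprocess

# Assumed Windows service names for the SharePoint timer service.
TIMER_SERVICE = {"2007": "SPTimerV3", "2010": "SPTimerV4"}


def restart_commands(version):
    """Build the stop/start commands for the given SharePoint version."""
    service = TIMER_SERVICE[version]
    return [["net", "stop", service], ["net", "start", service]]


def restart_timer(version, dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for command in restart_commands(version):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


# Dry run: shows what would be executed without touching any service.
restart_timer("2007")
```

Running something like this right after each redeploy makes it much harder to end up debugging against the stale, cached DLL.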

SharePoint 2010: Mysterious Errors Using Query String Parameters

SharePoint 2010
I am SharePoint... Feel my wrath!

One of the things I love about my job is that satisfying feeling of accomplishment that I get when I solve one of SharePoint’s quirky difficulties. If it weren’t for this feeling of euphoria that comes along every so often, I’d have gone insane long ago. Thankfully, SharePoint has no shortage of strange behavior and head-scratching moments.

One of my colleagues recently ran into a strange error while developing a custom Application Page. No matter how he was catching and handling runtime errors, every time an exception was raised during the Page Load event, the page would crash, displaying a perplexing error message and stack trace.

Specifically, he’d receive a “No item exists at . It may have been deleted or renamed by another user” error. What item? The one that the page isn’t using at all? Obviously. The corresponding stack trace wasn’t too helpful either. When stepping through the debugger, it was clear that the Page Load event was executing successfully, but the page was still crashing, even when the entire method was wrapped in a try/catch block with proper exception handling.

At this point, I thought of something. The page was utilizing query string parameters to receive data and pass data to itself across postbacks, so perhaps something was wrong with one of the parameters? I noticed one of the parameters he was using was named “ID”. I suggested we change the name of this parameter to something else, and lo and behold, the problem was solved.

The Moral of The Story

For reasons I still don’t quite understand, although it does sort of make sense, the “ID” query string parameter is a reserved keyword in SharePoint. Any time this parameter is present, SharePoint tries to do something internally, which sometimes makes it take a crap. Oddly enough, the problem only happens when an exception is raised during code execution, regardless of error handling. The moral of this story is to NEVER use a query string parameter named “ID” while developing for SharePoint.

There are several other query string parameters one should not use as well, and most (but not all) of them are far more obvious than the ambiguously-named “ID” parameter:

  • FeatureId
  • ListTemplate
  • List
  • ID
  • VersionNo
  • ContentTypeId
  • RootFolder
  • View
  • FolderCTID
  • Mode
  • Type
  • PageVersion
  • IsDlg
  • Title
  • _V3List_

(Thanks to Stefan Goßner over at TechNet Blogs for this list.)
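To make the list a little more actionable, here is a short Python sketch that checks a URL for any of the parameter names above. It is only an illustration: the URLs are made up, and the case-insensitive comparison is an assumption on my part about how the names are matched.

```python
from urllib.parse import parse_qsl, urlsplit

# Parameter names taken from the list above.
RESERVED = {
    "FeatureId", "ListTemplate", "List", "ID", "VersionNo", "ContentTypeId",
    "RootFolder", "View", "FolderCTID", "Mode", "Type", "PageVersion",
    "IsDlg", "Title", "_V3List_",
}
RESERVED_LOWER = {name.lower() for name in RESERVED}


def reserved_params(url):
    """Return the query string parameter names that collide with the list."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return [name for name, _ in pairs if name.lower() in RESERVED_LOWER]


# Made-up application page URLs for illustration.
bad = reserved_params("http://intranet/_layouts/custom.aspx?ID=7&customer=42")
good = reserved_params("http://intranet/_layouts/custom.aspx?ItemKey=7&customer=42")
print(bad, good)
```

A check like this could run in a unit test or a code review script, so a colliding name gets caught before it ever reaches a SharePoint page.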

SharePoint: Resolving Access Denied errors for Site Owners

Recently, I experienced a very strange problem while working on a client’s SharePoint 2007 install. SharePoint’s permission management isn’t always the easiest or most intuitive, but for the most part it works pretty well. And then there are the head-scratchers, which make absolutely no sense until the cause of the problem is discovered.

The Problem

Users belonging to the Site Owners group were receiving “Access Denied” errors on a particular site. The Site Owners group has Full Control permissions, so logically they shouldn’t be receiving “Access Denied” for any reason, unless a specific page or library does not inherit its permissions from the site. Thus, this was the first thing I checked. All of the lists in the site were indeed inheriting permissions, so this could not possibly be the issue.

Access Denied
But I'm a Site Owner! I have Full Control!

The “Access Denied” error made it clear that the users in the Site Owners group did not have permissions to access something, but what could it be? After Googling around for a while, I found a blog post detailing a similar issue and the possible solution. In the case detailed by the blogger, the Master Page Gallery for the site collection had been set up with custom permissions. Even though users had Full Control of the site, they did not have permissions to load the Master Page, causing the Access Denied error throughout the site. This prompted me to check the Master Page Gallery permissions on my client’s site, but I found this was not the cause.

What could possibly be the cause?

With no other ideas, I decided to look in the SharePoint Group settings and make sure they were configured correctly. I noticed that all of the groups were set up so that only the group members could view membership of a group. Essentially, this means that someone in the Site Owners group cannot see which users are in the Site Members or Site Visitors groups, or any other custom groups that may be configured.

Somehow, SharePoint was trying to view group membership of a group of which the logged-in user was not a member. I toggled the group settings so that everyone could view group membership, and voilà! The “Access Denied” error went away.

Group Settings

Now that I’d identified the source of the error, I needed to identify the root cause. Why was SharePoint trying to view group membership using the current logged-in user?

This particular site had many custom web parts built by a third party, so I started there. I set the groups back so that only members could view group membership, and then removed the custom web parts from a page one by one. Once the offending web part was removed, the “Access Denied” error did not occur, meaning that the custom code in that web part was the culprit. It apparently was written to access SharePoint groups through the SharePoint Object Model, but was using the current user’s credentials instead of using elevated privileges or impersonation.

Attack of the poorly written web part

The moral of this story is to make sure your custom SharePoint solutions use the proper privileges when appropriate. The web part cannot do any more than the permissions it is given, and oftentimes a user, even a Site Owner, does not have privileges to perform many of the actions available in the SharePoint Object Model. Properly written code can mitigate this problem, while still maintaining proper security throughout the site. Poorly written code can cause headaches for everyone, especially Site Owners.

Dealing with Workflow Failed On Start (retrying) errors in Microsoft SharePoint

I don’t know why I expect SharePoint to work without problems. In almost every project I’ve done, some unforeseeable problem pops up that seems to make little sense. Maybe, after beating my head against the wall so many times, I’ve developed SharePoint amnesia, but I always expect the project to be a pleasant experience. I guess that’s just my inner masochist rearing its ugly head.

Recently, I had a working Visual Studio workflow for SharePoint 2007 which was designed to set access permissions at the list item level based on the metadata assigned to the document. When a document is added or updated in the library, the workflow will update the permissions associated with the document. Simple stuff.

Then we had a new requirement. Since permissions are assigned based on metadata values assigned to documents, what happens when the SharePoint groups and/or users associated with a value need to change? With the current workflow, it would be necessary to manually run the workflow for each and every document. In a small document library, this would not be a problem. But when you’re dealing with a document library containing several thousand documents, this would be a life sentence.

In order to accommodate the new requirement, I had to modify the workflow to be able to loop through all of the documents in the library and re-apply permissions. Making the change was easy enough; the code that sets the permissions on one document simply had to be wrapped in a foreach loop of all documents in the library. Problem solved, right?

Failed On Start (retrying)
It's lying!

I deployed the workflow to the test environment and started it. It ran for several minutes with no apparent problems. The workflow history list was logging all of the tracking messages for each document, and permissions were being set as expected. Then all of a sudden, the workflow status switched to “Failed on Start (retrying).” Workflow history logs stopped appearing, and the workflow was certainly not “retrying”.

After several Google searches came up largely empty, I began making changes to the code to see if I could get a different result. Nothing was working. No matter what I changed, the workflow would run for several minutes and fail again. Then I decided to add more workflow history logging to pinpoint a particular line of code that may be causing the error.

With the new workflow history logs added into the code, I redeployed and ran the workflow again. This time I noticed the workflow was much quicker to fail. It seemed that the workflow history list had something to do with the problem. To test my hypothesis, I commented out the workflow history logs and tried again. Sure enough, the workflow ran to completion, without any problems. After I verified on a random selection of documents that permissions had been set correctly, I did my happy dance. Problem solved.

So what happened?

My best explanation is that the workflow was logging too many workflow history items, which eventually caused the workflow to crash. Since the workflow was originally written to run on a single document, it was adding 5-10 workflow history logs per document, depending on the values of the metadata tied to the document. Multiply this by 5,000 documents, and you get 25,000 to 50,000 new list items per workflow instance.

It is generally recommended to keep list sizes below a few thousand list items in SharePoint 2007, so the plethora of workflow history items was essentially crushing the workflow under its own weight.
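The arithmetic is easy to reproduce. The Python sketch below redoes the estimate and, purely as a hypothetical alternative that I did not actually implement, shows how few entries you would write if the workflow logged one summary line per batch of documents instead. The batch size of 500 is an arbitrary number chosen for illustration.

```python
import math


def history_entries(documents, entries_per_document):
    """Entries written when every document logs its own history items."""
    return documents * entries_per_document


def batched_entries(documents, batch_size):
    """Entries written when only one summary is logged per batch."""
    return math.ceil(documents / batch_size)


low = history_entries(5000, 5)
high = history_entries(5000, 10)
summaries = batched_entries(5000, 500)  # hypothetical batch size

print(f"Per-document logging: {low} to {high} history items")
print(f"One summary per 500 documents: {summaries} history items")
```

Either way, the point is the same: what is harmless for one document becomes a very large list once it is multiplied across the whole library.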

The moral of this story is this: Logging is great for debugging and ensuring proper operation of your workflows, but don’t go overboard. It’s like they always say: “Too many logs in the fire means no more roasted marshmallows.” Or something like that.

How to Perform SharePoint Development On A Client Workstation

One of the most difficult restrictions for a SharePoint developer to deal with can be the requirement to do development on a SharePoint server.  Personally, I prefer doing my development on my local machine, eliminating the need to establish a remote desktop connection to a different machine in order to write code.

Unfortunately, SharePoint development requires many DLL files which are included with an installation of SharePoint on a server.  To make matters worse, SharePoint 2010 requires an x64 server, further complicating the issue.  Fortunately, there is an easy workaround that can allow a SharePoint developer to be productive, even while using their laptop on the road without an available internet connection.

Copy the SharePoint DLLs

As I mentioned before, SharePoint development requires DLL files that are included with a SharePoint 2007 or 2010 installation.  The first step is to grab these off of a SharePoint server.  For SharePoint 2007, they are located in the hive at C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\ISAPI\, and for 2010 at C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\ISAPI\.  Copy the DLL files in this directory from the server, and paste them at the exact same file path on your local machine.  Since your PC likely does not have SharePoint installed, you may have to create the directory structure yourself.

SharePoint 2007 DLL Directory
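If you would rather script the copy than do it by hand, something along these lines should work. This is a minimal Python sketch, not part of my original process: it creates the target directory structure if it is missing and copies only the .dll files. The demo at the bottom uses throwaway temporary folders and a placeholder file so it can run anywhere. On a real workstation you would pass the server's ISAPI folder (for example over a network share) as the source and the matching local hive path as the target.

```python
import shutil
import tempfile
from pathlib import Path


def copy_dlls(source_dir, target_dir):
    """Copy every .dll from source_dir into target_dir, creating it if needed."""
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for dll in sorted(source.glob("*.dll")):
        shutil.copy2(dll, target / dll.name)
        copied.append(dll.name)
    return copied


# Demo with temporary folders and a placeholder file (not a real assembly).
with tempfile.TemporaryDirectory() as tmp:
    server_isapi = Path(tmp) / "server" / "ISAPI"
    server_isapi.mkdir(parents=True)
    (server_isapi / "Microsoft.SharePoint.dll").write_bytes(b"placeholder")
    (server_isapi / "readme.txt").write_text("not an assembly")

    local_isapi = Path(tmp) / "local" / "12" / "ISAPI"
    copied_names = copy_dlls(server_isapi, local_isapi)
    copied_exists = (local_isapi / "Microsoft.SharePoint.dll").exists()

print(copied_names)
```

Only files ending in .dll are copied, so anything else sitting in the source folder is left alone.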

Register the Assemblies to the GAC

Now that you have the DLL files on your workstation, you will be able to include them as references in your Visual Studio projects just as you would with any other DLLs.  However, if you want them to auto-register with your project when you use a Visual Studio 2010 SharePoint template or a WSPBuilder template, you must register the DLL files in your local Global Assembly Cache.  To do this, open the directory on your workstation that contains the SharePoint DLLs and drag them into the C:\Windows\assembly\ directory.  This will register them with the GAC on your workstation, and Visual Studio should successfully find the assemblies when a template is loaded up.  Although these assemblies may be 64-bit, this will work fine even though your workstation may be 32-bit.

Global Assembly Cache

If you’ve successfully completed the two steps above, you should be able to write your code and successfully compile your project.  Once you generate your WSP file, you can then deploy it like any other WSP.

Please Use Caution

If you do development for both 2007 and 2010, you can do this for both on the same workstation; just be sure to complete both steps for each version.  Since the 2007 and 2010 assemblies have different Assembly Versions (12.0.0.0 and 14.0.0.0), you don’t have to worry about conflicts in the GAC.  Be sure to use caution, however, because in my experience, Visual Studio tends to grab the SharePoint 2010 version of the DLL even for a SharePoint 2007 project if they’re both registered on your workstation.  If this happens, remove the incorrect reference, and add a reference to the correct 2007 DLL from your 12\ISAPI directory.

SharePoint Solution Deployment Horror

A couple of weeks ago, I had the worst professional SharePoint moment of my life, thanks to SharePoint 2007, Visual Studio 2008, and WSPBuilder. Deploying custom solutions to SharePoint is something I have done many times, but I’ve learned to expect nothing but the worst, especially when the task at hand seems like it should be easy. In the end, I had wasted far too much time performing a seemingly mundane task. My goal is to relay this experience and how I overcame it, in the hopes of saving several hours of your life.

Using Visual Studio 2008 and WSPBuilder, I had created a custom Workflow solution to deploy to a SharePoint 2007 site. The workflow was a fairly simple one, simply copying a document from a drop-off library to a document library on a different site collection, along with its associated metadata. Using a stand-alone installation of SharePoint 2007 on a virtual machine, I created a model of the production site, based on my understanding of the architecture. Then I proceeded to write a workflow to fit this design.

After writing the code and creating the .wsp, the initial deployment to production went without a hitch. However, after some quick testing, it was clear a couple minor tweaks needed to be made to the workflow; my development site did not quite match the production environment. In retrospect, I would say this is where things began to go wrong. But at the time, this seemed like no big deal. It should be easy to tweak the workflow code and redeploy it. Right?

Unfortunately, I’ve never been so wrong. After making the code changes, I created a new .wsp, retracted and removed the old solution from SharePoint, and deployed the new solution. I tested the workflow again, and, to my bewilderment, it still did not work correctly. In fact, it didn’t seem to be behaving any differently than the first time.

I went back over my code to see if I made any silly mistakes. Everything looked good as far as I could tell, and unfortunately in this situation, I could not debug my solution on the production server. My only option was to change the code and redeploy. After making minor code changes and redeploying several times, nothing I did seemed to make a difference. The workflow seemed to run exactly the same each and every time, returning what seemed like the same error in the same place.

So I began to think about it a little differently. It almost seemed to be using the original DLL in the GAC (Global Assembly Cache), even though I obviously had retracted the old one and deployed a new one. The GAC is a cache, after all. Just to be sure, I navigated to the GAC in Windows Explorer at C:\Windows\assembly, and sure enough, the DLL was not listed in there after retracting the solution. And when I deployed the new solution, the DLL would show up. It seemed to be working exactly as I would expect.

Global Assembly Cache
Windows Explorer view of the GAC

One thing I did notice was that the assembly version was still 1.0.0.0. Perhaps since I left the assembly version the same in Visual Studio, the GAC wasn’t taking the new DLL at all? To change the DLL version, I had to change it in 3 spots in Visual Studio: in the project properties pages, in the elements.xml file, and in the feature.xml file. After making this change, I packaged up the .wsp again, and deployed it, confident my nightmare was nearing its end.

This time, SharePoint did not seem to recognize my new feature at all, as I could not find it in the site collection features list to activate it, or see the workflow in the list of available workflows to add to a list. After Googling my issue, I decided that I may benefit from using STSADM exclusively for adding, deploying, and installing solutions and features. Previously, I had been simply adding it with STSADM, and then using Central Admin to deploy the solution and activate it. This approach, in my experience, has appeared to work flawlessly with SharePoint 2010, but apparently this is not the case for SharePoint 2007.

It turned out that while I had updated the solution to a new version, the feature was still looking for the old version. This is where extensive use of stsadm came into play. Using the command prompt, I issued these commands, in order:

  • stsadm -o deactivatefeature -name <name of feature folder> -url <URL of SharePoint web application>
  • stsadm -o uninstallfeature -name <name of feature folder> -force
  • stsadm -o retractsolution -name <name of solution .wsp file> -immediate -allcontenturls
  • stsadm -o deletesolution -name <name of solution .wsp file>
  • stsadm -o addsolution -filename <file path to .wsp file>
  • stsadm -o deploysolution -name <name of solution .wsp file> -immediate -allowgacdeployment -url <URL of SharePoint web application>
  • stsadm -o installfeature -name <name of feature folder> -force
  • stsadm -o activatefeature -name <name of feature folder> -url <URL of SharePoint web application> -force
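Because the order matters and the list is long, it can be handy to generate the commands rather than retype them. The Python sketch below simply builds the eight command lines from four values and prints them, using plain hyphens for the switches. It does not run anything. The feature name, solution name, file path, and URL are all placeholders I made up, and I have assumed that deactivatefeature takes the same -name argument as the other feature commands.

```python
def redeploy_commands(feature, solution, wsp_path, url):
    """Build the retract-and-redeploy sequence in the order listed above."""
    return [
        f"stsadm -o deactivatefeature -name {feature} -url {url}",
        f"stsadm -o uninstallfeature -name {feature} -force",
        f"stsadm -o retractsolution -name {solution} -immediate -allcontenturls",
        f"stsadm -o deletesolution -name {solution}",
        f"stsadm -o addsolution -filename {wsp_path}",
        f"stsadm -o deploysolution -name {solution} -immediate "
        f"-allowgacdeployment -url {url}",
        f"stsadm -o installfeature -name {feature} -force",
        f"stsadm -o activatefeature -name {feature} -url {url} -force",
    ]


# Placeholder values for illustration only.
commands = redeploy_commands(
    feature="CopyDocumentWorkflow",
    solution="CopyDocumentWorkflow.wsp",
    wsp_path=r"C:\deploy\CopyDocumentWorkflow.wsp",
    url="http://intranet",
)
for command in commands:
    print(command)
```

If any of your values contain spaces, wrap them in quotes before pasting the output into a command prompt.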

Using stsadm -o upgradesolution probably would have worked as well, but that is not the route I ended up taking. Finally, I tested the workflow one last time, and it worked like a charm. After wasting many hours of my life on something that seemed so simple, I was relieved to have this nightmare behind me. Hopefully this will help prevent you from experiencing the same problem.

As I mentioned before, I probably could have saved a lot of headache if I had a better understanding of the production site’s architecture before I began development. It is important to ask the right questions in order to get the answers you need. The person you are getting the information from won’t always know all of the important details, so it is up to you, as the architect and developer, to get the information you need to get the job done.

If you want to learn more about WSPBuilder or STSADM, please check back here soon. I plan on covering both of these useful tools in the near future.

Image Courtesy: Brian Pennington

A Developer’s Guide to Troubleshooting Microsoft SharePoint Errors

Anybody that has worked with Microsoft SharePoint as a developer knows that it can be an exercise in frustration for the strangest of reasons. One only has to see a handful of helpful error messages such as “An unexpected error has occurred” to understand exactly what I mean.

Yes, one may look through the logs to find more details about the error that occurred, but in most cases this requires a Correlation ID, which in many circumstances SharePoint neglects to make known. Without this 32-digit, difficult-to-remember hexadecimal value, finding the error that occurred can be very difficult (if not impossible in some circumstances). At least SharePoint will reward a diligent developer with a stack trace for a job well done.

After 5 months of SharePoint training, development, research, and a certification, I don’t think I’ve become much better at SharePoint development. Writing the code is the easy part. What I have become better at is fixing the errors, and avoiding some errors altogether.

Dealing with SharePoint Errors

Starting out, the best thing one can do is learn to use the logs, which are buried deep within the catacombs of the Program Files directory. If SharePoint doesn’t issue a correlation ID, search for exceptions around the same time period in which the error occurred.

When you find the exception type which was thrown, Google it! If you can’t find the error in the log, Google it! At least ten thousand developers have experienced the same issue, and a number of them have either recorded the solution in a blog or posted on a message board, where someone else posted the answer. When armed with knowledge, nothing can take down a .NET developer.

Keep watching this space, as I plan on covering a wide range of SharePoint 2007 and 2010 topics, including user experience, administration, and development, along with other struggles I come across along the way, all in the hopes of keeping you from making the same mistakes I did.

In case you’re wondering, here’s where the logs are located. You can see why I didn’t put them in the body of the article:

  • SharePoint 2007: C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\LOGS\
  • SharePoint 2010: C:\Program Files\Common Files\Microsoft Shared\web server extensions\14\LOGS\
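Digging through those folders by hand gets old quickly, so here is a rough Python sketch for searching every .log file in a directory for a piece of text, such as a Correlation ID or the word "exception". Treat it as a starting point rather than a finished tool: the file name and log lines in the demo are invented, and the default encoding is a guess, so pass a different one if your log files come out garbled.

```python
import tempfile
from pathlib import Path


def find_in_logs(log_dir, needle, encoding="utf-8"):
    """Return (file name, line number, line) for each line containing needle."""
    needle = needle.lower()
    hits = []
    for log_file in sorted(Path(log_dir).glob("*.log")):
        with open(log_file, encoding=encoding, errors="replace") as handle:
            for line_no, line in enumerate(handle, start=1):
                if needle in line.lower():
                    hits.append((log_file.name, line_no, line.rstrip("\n")))
    return hits


# Demo against an invented log file in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "SERVER-20100101-1200.log"
    sample.write_text(
        "12:00:01\tw3wp.exe\tMedium\tRequest started\n"
        "12:00:02\tw3wp.exe\tUnexpected\tSystem.FormatException: "
        "Guid should contain 32 digits with 4 dashes\n",
        encoding="utf-8",
    )
    demo_hits = find_in_logs(tmp, "exception")

for file_name, line_no, text in demo_hits:
    print(f"{file_name}:{line_no}: {text}")
```

On a real server you would point log_dir at one of the LOGS folders listed above and search for the Correlation ID, or for "exception" when SharePoint did not give you one.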