Diagnosing the Unknown Alias problem in SharePoint

My frustraition with a product that is, in theory, action packed, but faultering in execution continues - this time it is incoming email. A well behaved mail gateway should return a non-delivery receipt if it is unable to deliver email, it's common courtesy. SharePoint, both Windows SharePoint Services and Office SharePoint Service, seems to take exception to this idea. As an IT guy it is the most frustraiting thing to see a module that was slapped together with little regard to proper protocol behaviour. My frustration started when email enabled lists just stopped capturing email randomly. I am not alone in this frustration either, there are clearly others yet not a peep from Microsoft:

The SPTimer service polls the SMTP drop folder on a regular basis to look for incoming messages. When the message is picked up it is read in and the alias appears to be cross referenced against an EmailEnabledLists table. I was tipped off to this by a post on the SharePoint newsgroup. If the logging verbosity is set and the alias is not found it will log an Unknown Alias error to both the event log and the trace logs:

Type:		Warning
Date:		2/26/2008
Time:		8:42:30 AM
Event:		6873
Source:		Windows SharePoint Services 3
Category:	E-Mail
User:		N/A
Description:
An error occurred while processing the incoming e-mail file
D:\SMTP\Drop\54f3fb0601c8787d0000024f.eml. The error was: Unknown alias..

Along with the following event:

Type:		Information
Date:		2/26/2008
Time:		8:42:30 AM
Event:		6871
Source:		Windows SharePoint Services 3
Category:	E-Mail
User:		N/A
Description:
The Incoming E-Mail service has completed a batch.  The elapsed time was 00:00:00.0156256.
The service processed 1 message(s) in total.


Errors occurred processing 1 message(s):

Message ID: <SERVERxiDd9D9DQVXwR100000a19@domain>

 The following aliases were unknown:
 mylist

After poking around in the database to see if I could find any hints as to why this functionality decided to stop all of a sudden. Running the following query on each content database gave me a dump of the email aliases that each list was configured against:

SELECT 	    tp_EmailAlias as EmailAlias,
            SiteId,
            tp_WebId AS WebId,
            tp_Id as ListId
FROM        AllLists
INNER JOIN  Webs ON AllLists.tp_WebId = Webs.Id
WHERE       tp_EmailAlias IS NOT NULL

Cross referencing the results with the EmailEnabledLists table I was able to find that there were some, no, a lot of missing rows. Something was deleting my email enabled lists. At this point I don't know what, but it smells like a bug. In the mean time I have created a script to re-populate the configuration database table based on the content databases.

SET NOCOUNT ON
IF OBJECT_ID('tempdb..#EmailEnabledLists') IS NOT NULL
BEGIN
   DROP TABLE #EmailEnabledLists
END

CREATE TABLE #EmailEnabledLists (
    Alias   NVARCHAR(128) NOT NULL,
    SiteId  UNIQUEIDENTIFIER NOT NULL,
    WebId   UNIQUEIDENTIFIER NOT NULL,
    ListId  UNIQUEIDENTIFIER NOT NULL)

-- TODO: Insert your SharePoint Content Database Names into the Set
DECLARE DatabaseCursor CURSOR FOR
    SELECT  [Name]
    FROM    master.[sys].databases
    WHERE   [State] = 0 AND -- State = Online
            [Name] IN ('SharePoint_ContentDatabase1', 'SharePoint_ContentDatabase2')
    ORDER BY [Name]
OPEN DatabaseCursor

DECLARE @DatabaseName VARCHAR(128)
FETCH NEXT FROM DatabaseCursor INTO @DatabaseName
WHILE @@FETCH_STATUS = 0
BEGIN

    DECLARE @Command NVARCHAR(4000)
    SET @Command =
        N'USE [' + @DatabaseName + '];' + 
        N'INSERT INTO #EmailEnabledLists (Alias, SiteId, WebId, ListId) ' +
        N'SELECT  ' +
        N'    tp_EmailAlias as Alias, ' +
        N'    SiteId, ' +
        N'    tp_WebId AS WebId, ' +
        N'    tp_Id as ListId ' +
        N'FROM ' +
        N'    AllLists ' +
        N'    INNER JOIN  ' +
        N'        Webs ' +
        N'    ON      AllLists.tp_WebId = Webs.Id ' +
        N'    WHERE   tp_EmailAlias IS NOT NULL'
    EXEC sp_executesql @Command
    
    FETCH NEXT FROM DatabaseCursor INTO @DatabaseName
END
CLOSE DatabaseCursor
DEALLOCATE DatabaseCursor

-- TODO: Insert your SharePoint Configuration Database Name
USE [SharePoint_Config]

DECLARE AliasCursor CURSOR FOR
    SELECT  [Alias]
    FROM    #EmailEnabledLists
    ORDER BY [Alias]
OPEN AliasCursor

DECLARE @AliasName NVARCHAR(128)
FETCH NEXT FROM AliasCursor INTO @AliasName
WHILE @@FETCH_STATUS = 0
BEGIN
    IF(NOT EXISTS (SELECT Alias FROM dbo.EmailEnabledLists WHERE Alias = @AliasName))
    BEGIN
        DECLARE @SiteId UNIQUEIDENTIFIER
        DECLARE @WebId UNIQUEIDENTIFIER
        DECLARE @ListId UNIQUEIDENTIFIER

        SELECT  @SiteId = SiteId,
                @WebId = WebId,
                @ListId = ListId
        FROM    #EmailEnabledLists

        PRINT 'Inserting ' + @AliasName
        INSERT  dbo.EmailEnabledLists
                (Alias, SiteId, WebId, ListId, [Deleted])
        VALUES  (@AliasName, @SiteId, @WebId, @ListId, 'FALSE')
    END

    FETCH NEXT FROM AliasCursor INTO @AliasName
END
CLOSE AliasCursor
DEALLOCATE AliasCursor

IF OBJECT_ID('tempdb..#EmailEnabledLists') IS NOT NULL
BEGIN
   DROP TABLE #EmailEnabledLists
END

Now the million dollar question - what is the root cause? If I had time I would setup a trigger on the EmailEnabledLists table to notify me when something is deleted to get an idea of time frame and then use SQL Profiler to watch that process to see what the calls looks like and where they are coming from. Better yet, maybe some folks from the SharePoint team can chime in and share some more to help us understand the quirks of a product struggling for stability in the face of mass adoption.

UPDATE (24-Mar-08): Linked to my TechNet forums post as well, but no further progress. Looking into the Microsoft.SharePoint.Administration.SPEmailEngine class with Reflector I cannot see any code to generate NDRs either.

Comments Subscribe to Post Comments Feed

Todd Klindt said:

Good investigation work.  I'm curious to see if we get an official response.  I can't reproduce the issue or I'd burn a PSS call trying to figure it out.

tk

Jay said:

I have had the same problem exactly. I find it ridiculous that the system just forgets the aliases. I also can't believe that WSS would not have a catch all for incoming email with no alias. The emails just disappear. That's a great policy. The whole thing has lead me to deliver messages to both WSS and the public folders WSS is meant to replace. Unbelievable!

John said:

Not sure if this is exactly the same thing, but I have a problem with MOSS publishing sites. We frequently have a case where our user in the Approvers group for a site collection doesn't get notified when a page has been Submitted For Approval. When we check the user's MySite profile, their Work E-mail field is blank. We have also seen the same scenario with the user who submitted the page for approval not receiving a confirmation email when the page gets approved.

Right now our fix is to remove and re-add the user to the Approvers group.

John

Peter Seale said:

Thanks for taking the time to gather all this up. We're experiencing the same problem and have opened a case; I'm going to forward them to this page to summarily describe what is happening. I'll let you know how this works out.

Patrick said:

Colin,

   Glad i read your blog.I have done i could to configure WSS list o receive incoming mails but NO SUCCESS.

I keep seeing this in WSS LOG:

05/02/2008 14:01:42.09 OWSTIMER.EXE (0x1324)                   0x1300 Windows SharePoint Services   E-Mail                         6871 Information The Incoming E-Mail service has completed a batch.  The elapsed time was 00:00:00.  The service processed 0 message(s) in total.

I can see my emails in my drop folder for some seconds and then  dissapears..

MS Share point  Team where are you??

If you have any solutions for this please email me to : naijacoder@hotmail.com

Thanks Colin

Kurt said:

Great work here.  I used the script above to re-insert the mysteriously missing entries from EmailEnabledLists.  However, when I recycled the timer, and it processed the very next one, the entry was once-again nuked from the EmailEnabledLists table.  This is bad...

Any traction anywhere on this???

m@ said:

I just wanted to pass on  that we are experiencing the same issue (Random and usually discovered by the user community)

Our work around has been un-checking and re-checking the “Overwrite files with the same name” flag in the non-working library (Email Settings).

We are not sure why this flag reset gets the mail “receiving” going again for that library but it does.

Any additional information on the culprit would grateful.

Great blog BTW!

Geoff said:

Great diagnostic work. This is exactly what has just happened to us. There seems to be little available from Microsoft about this so I'm gonna log a ticket. THey can't ignore us forever, right?

Have Your Say