Solved

Email Management package rule works for one test email but not for another

  • 29 September 2021


Hi All,

I am trying to customize the open-source Email Management package for my Outlook email needs. The problem is that the extraction rules work perfectly for one test email in expert.ai Studio, but the same rules do not extract information from other test emails.

 

Sample Example scenario:

TestEmail1.txt with just from address

From: Venkat Praveen Tatini <venkat@xyz.com>

I was able to extract the name [Venkat Praveen Tatini] from the TestEmail1.txt 

 

TestEmail2.txt with just from address

From: Sherlock Holmes <sherlock@xyz.com>

I was unable to extract the name [Sherlock Holmes] from the TestEmail2.txt

This is weird because the rule is pretty simple: it extracts people's names. I compared my custom package with the original package and haven’t missed anything, but I cannot figure out why the extraction rule that works for TestEmail1.txt does not work for TestEmail2.txt.

It would be extremely helpful if anyone could tell me what the issue might be.

Looking forward to the community's response.

 

Cheers,

venkat.


Best answer by lmusetti 1 October 2021, 12:54



Hi Venkat! Please post your rule here and let’s take a look. Is Sherlock Holmes disambiguated as a person name in the Semantic Analysis panel?

 


Hi Lorenzo,

Thanks for getting back.

From: Venkat Praveen Tatini <venkat@xyz.com> 

This is TestEmail1.txt

So the rule I have written is:

SCOPE SENTENCE
{
    //Extraction of PEOPLE
    IDENTIFY(Data)
    {
        KEYWORD("From:")
        >>
        @EmailSenderName[ANCESTOR(100215627)+TYPE(NPH)]|[SEQUENCE]//@SYN: #100215627# [person]
    }
}

For TestEmail1.txt, the rule takes the entire name, Venkat Praveen Tatini, which is correct for the From rule.

Here Venkat Praveen Tatini is disambiguated as a person in the semantic analysis for TestEmail1.txt.
 

To: Venkat Praveen Tatini <venkat@abc.com> 

So the rule I have written is:

SCOPE SENTENCE
{
    //Extraction of PEOPLE
    IDENTIFY(Data)
    {
        KEYWORD("To:")
        >>
        @EmailReceiverName[ANCESTOR(100215627)+TYPE(NPH)]|[SEQUENCE]//@SYN: #100215627# [person]
    }
}

The issue here is that for TestEmail2.txt the rule does not take the entire name, Venkat Praveen Tatini; the extraction only gives me Venkat.

Here Venkat Praveen Tatini is not disambiguated as a person in the semantic analysis for TestEmail2.txt.

Lorenzo, I understand that we only get extractions for what is shown as a name in the semantic analysis. What confuses me is this: if the rule works for From:, why does it not work for To:?

That's why I love the expert.ai language: it is interesting and challenging as well.

Also, if I may ask, could you please reply to me by private message regarding this issue so that I can discuss it further?

And if you could reply to my emails in your spare time, that would be great.

Sorry if I ask a lot, but without experience with this language it’s tough. I hope you understand my situation.

Cheers,

Venkat.
 


 


Hi Venkat,

I would suggest removing the ANCESTOR attribute, as TYPE(NPH) by itself is enough to pull out people's names using the technology’s named entity recognition capabilities. I would also suggest replacing >> with <:2> to slightly increase the number of tokens allowed between “From:” or “To:” and the person name. Also, removing SEQUENCE means that only the person name is extracted and normalized, so you’re not also picking up “To:” and “From:” with the person name extraction.

Examples are shown below:

SCOPE SENTENCE
{
    //Extraction of PEOPLE
    IDENTIFY(Data)
    {
        KEYWORD("From:")
        <:2>
        @EmailSenderName[TYPE(NPH)]
    }
}

 

SCOPE SENTENCE
{
    //Extraction of PEOPLE
    IDENTIFY(Data)
    {
        KEYWORD("To:")
        <:2>
        @EmailSenderName[TYPE(NPH)]
    }
}

Note: this second rule extracts all people names preceded by “To:” as EmailSenderName. Shouldn’t it be an EmailRecipientName?

 

You could also consider using TEXT instead of SEQUENCE to bypass name normalization and pull out people's names as they appear in the text. See the example below:

 

SCOPE SENTENCE
{
    //Extraction of PEOPLE
    IDENTIFY(Data)
    {
        KEYWORD("From:")
        <:2>
        @EmailSenderName[TYPE(NPH)]|[TEXT]
    }
}
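
Applying the same pattern to the recipient line might look like the sketch below. It only recombines pieces already shown in this thread (KEYWORD, <:2>, TYPE(NPH) and the TEXT transformation). The field name EmailReceiverName is taken from the rule in the question and is assumed to be defined in the project's template.

```
SCOPE SENTENCE
{
    //Extraction of PEOPLE (recipient); sketch only, EmailReceiverName is assumed to exist in the template
    IDENTIFY(Data)
    {
        KEYWORD("To:")
        <:2>
        @EmailReceiverName[TYPE(NPH)]|[TEXT]
    }
}
```

Whether this returns the full name still depends on the name being disambiguated as a person entity in the Semantic Analysis panel for that file.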

 

All of the above suggestions will increase your rules' recall in extracting sender and/or recipient names. If the named entity recognition in TestEmail2.txt doesn’t disambiguate Venkat Praveen Tatini as a whole person name, I’d suggest either using tagging rules to recondition the disambiguation and make Venkat Praveen Tatini a whole person name, or using composition to compose the whole name one token at a time.

 

Hope this helps!
