Asked  1 Year ago    Answers:  5   Viewed   8 times

I have an email address that could be either $email = "x@example.com"; or $email = "Johnny <x@example.com>"; and I want to get $handle = "x"; for either version of $email. How can this be done in PHP (presumably with a regex)? I'm not very good at regex.

Thanks in advance

 Answers

5

Use the regex <?([^<]+?)@ then get the result from $matches[1].

Here's what it does:

  • <? matches an optional <.
  • [^<]+? does a non-greedy match of one or more characters that are not <. (The ^ at the start of the brackets negates the character class; it is not itself excluded.)
  • @ matches the @ in the email address.

A non-greedy match makes the resulting match the shortest necessary for the regex to match. This prevents running past the @.

Rubular: http://www.rubular.com/r/bntNa8YVZt
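As a minimal sketch of how that pattern could be used from PHP (the function name extractHandle is made up for illustration and is not part of the answer):

```php
<?php
// Hypothetical helper: returns the text before the @ using the regex
// from the answer, or null when the pattern does not match at all.
function extractHandle(string $email): ?string
{
    if (preg_match('/<?([^<]+?)@/', $email, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

echo extractHandle('x@example.com'), "\n";          // x
echo extractHandle('Johnny <x@example.com>'), "\n"; // x
```

For the display-name form, the match cannot start inside "Johnny " because [^<] refuses to cross the <, so the engine ends up starting at the < itself.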

Saturday, May 29, 2021
 
2

I think you'd be better off going through the email a line at a time, as the line breaks are what matter most in how an e-mail is put together.

Your rules would be:

  • If you get a double line break, then the body is starting - plain text type (as there are no headers to indicate which).
  • Otherwise, carry on until you get the "boundary=" bit, and then you record the boundary and hop into a "looking for boundary" mode.
  • Then, when you find a boundary, hop into "looking for content-type or double new-line" mode, and look for Content-Type (and note the Content-Type) or a double new-line (header has finished, body coming next until the next boundary).
  • While reading the body of the message, you're back in "looking for boundary" mode to repeat the process.
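The rules above could be sketched roughly like this. It is only an illustration under simplifying assumptions: a single level of boundaries (the first one found is kept), no decoding of transfer encodings, and a made-up function name splitMimeParts.

```php
<?php
// Hypothetical sketch of the line-by-line rules: returns a list of
// ['type' => ..., 'body' => ...] entries, one per body found.
function splitMimeParts(string $raw): array
{
    $lines = preg_split('/\r\n|\r|\n/', $raw);
    $state = 'headers';   // headers | seek | part_headers | body
    $boundary = null;     // first boundary seen; we stick with it
    $parts = [];
    $i = -1;              // index of the part currently being filled

    foreach ($lines as $line) {
        if ($state === 'headers') {
            if (preg_match('/boundary="?([^";]+)"?/i', $line, $m)) {
                $boundary = $m[1];      // record it, then look for it
                $state = 'seek';
            } elseif ($line === '') {
                // Blank line with no boundary: a plain-text body starts.
                $parts[] = ['type' => 'text/plain', 'body' => []];
                $i = count($parts) - 1;
                $state = 'body';
            }
            continue;
        }
        if ($boundary !== null && $line === '--' . $boundary . '--') {
            $state = 'seek';            // closing boundary
            continue;
        }
        if ($boundary !== null && $line === '--' . $boundary) {
            $parts[] = ['type' => null, 'body' => []];
            $i = count($parts) - 1;
            $state = 'part_headers';    // look for Content-Type or blank line
            continue;
        }
        if ($state === 'part_headers') {
            if ($line === '') {
                $state = 'body';        // header finished, body comes next
            } elseif (preg_match('/^Content-Type:\s*([^;\s]+)/i', $line, $m)) {
                $parts[$i]['type'] = $m[1];
            }
        } elseif ($state === 'body') {
            $parts[$i]['body'][] = $line;
        }
        // In 'seek' state all other lines are skipped.
    }

    foreach ($parts as $k => $part) {
        $parts[$k]['body'] = implode("\n", $part['body']);
    }
    return $parts;
}
```

A part whose type stays null is one that arrived without a Content-Type declaration, which the caller would have to decide how to treat.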

Something I remember from a long time ago - so the following may not be 100% accurate, but I'll mention it just in case. Be careful with messages that have attachments, as you can get two "boundary" markers. One boundary is nested within the other, though, so if you follow the rules above (i.e. grab the first boundary and stick with it) then you should be fine. But test your script with some attachments :)


Edit: additional info as asked in the question. An e-mail can have as many "bodies" as the user wishes to encode. You can have a plain-text version, an HTML version, a UTF-encoded version, an RTF version or even a Morse Code version (if the client knew how to handle "Content-Type: Morse/Code"!). Sometimes you don't get plain text, but only HTML versions (naughty users). Sometimes the HTML actually comes without the content type declaration (which may or may not get displayed as HTML, depending on the client). The boundary also splits off the attachments. Rich text is a gotcha from Outlook (although, to be fair, it usually IS converted to HTML). So no, there's somewhere between 0 and X bodies.

Saturday, May 29, 2021
 
5

Using regular expressions to validate emails is tricky.

Try the following email as an input to your regex, i.e. ^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$

abc@b...com
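A quick way to see the problem, assuming the pattern in the post lost its backslashes in formatting (i.e. it was meant to contain \w and \.):

```php
<?php
// Pattern from the answer, with the backslashes assumed to be \w and \.
$pattern = '/^[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}$/';

// The domain has consecutive dots, yet the pattern still accepts it,
// because [A-Za-z0-9.-]+ happily matches runs of dots.
var_dump(preg_match($pattern, 'abc@b...com') === 1); // bool(true)
```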

You can read more about email regex validation at http://www.regular-expressions.info/email.html

If you are doing this for an app, then validate the address by sending an email to it rather than relying on a very complex regex.

Saturday, May 29, 2021
 
Jauco
 
5

A regular expression is quite heavy machinery for this purpose. Just split the string containing the email address at the @ character, and take the second half. (In practice an email address contains only one @ character; a quoted local part may technically contain another, so split at the last @ if that matters.)
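A minimal sketch of that idea, assuming a bare address with no display name; the helper name domainOf is made up, and it uses strrpos so that it cuts at the last @:

```php
<?php
// Hypothetical helper: returns everything after the last @,
// or null when the string contains no @ at all.
function domainOf(string $email): ?string
{
    $pos = strrpos($email, '@');
    return $pos === false ? null : substr($email, $pos + 1);
}

echo domainOf('x@example.com'), "\n"; // example.com
```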

Sunday, August 15, 2021
 
pop
 
3

At face value, the following will work:

preg_match('~<([^>]+)>~',$string,$match);

But I have a sneaking suspicion you need something better than that. There are a ton of "official" email regex patterns out there, and you should be a bit more specific about the context and rules of the match.
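For illustration, here is how that call behaves on a sample value (the helper name bracketedAddress and the sample strings are assumptions, not from the question):

```php
<?php
// Hypothetical helper: returns whatever sits between < and >,
// or null when the string has no angle-bracketed part.
function bracketedAddress(string $string): ?string
{
    if (preg_match('~<([^>]+)>~', $string, $match) === 1) {
        return $match[1];
    }
    return null;
}

echo bracketedAddress('Johnny <x@example.com>'), "\n"; // x@example.com
```

Note that a bare address without angle brackets gives no match here, so the caller would need a fallback for that case.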

Monday, August 16, 2021
 
Taha
 