JafSoft Support Forums  
Added Characters in DeTagger output...

 
Added Characters in DeTagger output... - 7/18/2007 9:05:22 PM   
Guest
I am running 2.4.0.4 of DeTagger and have the following strange behaviour:
In source file:
<p class=MsoNormal style='line-height:normal'><b><span style='font-size:12.0pt;
font-family:"Times New Roman"'>Contact Information:</span></b><span
style='font-size:12.0pt;font-family:"Times New Roman"'> <br>
Janet A. Wright, Commission Counsel <br>
Margaret O'Neil Building <br>
410 Federal Street, Suite 3 <br>
Dover, DE 19901 <br>
Telephone: (302) 739-2399 <br>

In Output file:
<p><b>Contact Information:</b> <br />aJanet A. Wright, Commission Counsel <br />aMargaret O'Neil Building <br />1410 Federal Street, Suite 3 <br />oDover, DE 19901 <br />eTelephone: (302) 739-2399

Note all the SPURIOUS first characters in the lines...
I have a registered version of the App, and can send you the Policy File...
  Post #: 1
RE: Added Characters in DeTagger output... - 7/18/2007 11:53:14 PM   
Jaf

 

My guess would be that the spurious extra characters are down to the presence of Unicode in the input, or of HTML entities that can only be represented by Unicode in the output.  However, it's almost impossible to determine this precisely from a bulletin board post such as this.
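
To check that guess on your own machine, a rough sketch like the following can list any non-ASCII characters or HTML entities in the source text. It is plain Python, not part of Detagger; the function name `find_suspects` and the sample string are made up for illustration and are not taken from the poster's file.

```python
import re

# Matches named (&nbsp;), decimal (&#160;) and hex (&#xA0;) HTML entities.
ENTITY_RE = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")


def find_suspects(text):
    """Return (line_number, kind, value) for each non-ASCII character or entity."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Characters outside the 7-bit ASCII range.
        for ch in line:
            if ord(ch) > 127:
                hits.append((lineno, "non-ascii", "U+%04X" % ord(ch)))
        # HTML entities written out in the markup.
        for match in ENTITY_RE.finditer(line):
            hits.append((lineno, "entity", match.group(0)))
    return hits


# Hypothetical sample: a no-break space on line 1, an &nbsp; entity on line 2.
sample = "Dover, DE 19901\u00a0<br>\nTelephone:&nbsp;(302) 739-2399 <br>"
for hit in find_suspects(sample):
    print(hit)
```

If the lines that gain a spurious character in the output are the same lines this reports, that would point towards the Unicode explanation; if nothing is reported, the cause is probably elsewhere.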

If you could email me a sample file at jaf <at> jafsoft <dot> com, I'd be happy to look into this further.  Send any file as a .zip attachment (if possible) to preserve it in its original state as far as possible.
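
Any zip tool will do for this. As one possible sketch, assuming Python is to hand, the standard `zipfile` module can pack the file and confirm the archived bytes match the original; the helper names `zip_sample` and `zip_matches` are hypothetical.

```python
import os
import tempfile
import zipfile


def zip_sample(src_path, zip_path):
    """Store src_path in a new zip archive under its base file name."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(src_path, arcname=os.path.basename(src_path))


def zip_matches(src_path, zip_path):
    """Return True if the archived copy is byte-for-byte identical to the source."""
    with open(src_path, "rb") as f:
        original = f.read()
    with zipfile.ZipFile(zip_path) as zf:
        archived = zf.read(os.path.basename(src_path))
    return original == archived


if __name__ == "__main__":
    # Demonstration on a throwaway file; replace with the real sample path.
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "sample.htm")
    with open(src, "wb") as f:
        f.write(b"Dover, DE 19901\xa0<br>\r\n")
    archive = os.path.join(workdir, "sample.zip")
    zip_sample(src, archive)
    print(zip_matches(src, archive))
```

Reading and writing in binary mode matters here, since the aim is to keep the exact bytes (including any unusual characters and line endings) that the converter saw.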

There are options to use ANSI alternatives instead of Unicode; you could try those.

Unicode handling has been greatly improved in developments made since the official 2.4 release.  As a registered user you are entitled to access the "Early Adopter" releases that have this new functionality, ahead of the next official release.

The latest version also includes an option to completely suppress Unicode from the output.  Again, contact me via email if you'd like access to the improved version.

Cheers, Jaf

(in reply to Guest)
Post #: 2