Automatic \r stripping by hashcat?
#1
Hello,
I have not found any documentation stating that hashcat automatically strips \r (CR) characters, but it seems to be doing so. I had been operating under my apparently outdated understanding that, when present, \r needed to be removed manually (with dos2unix or similar).

The steps I used to test this are:
1. I created two wordlists, nixlist and winlist, with the same words. On each line in winlist I manually added a carriage return (\r) using CTRL-V CTRL-M in vim (I didn't have a Windows computer handy to make the file).
2. I confirmed the wordlist file structure using hexdump.
3. I manually hashed nixlist and winlist with openssl to confirm the \r was causing the hashes to differ.
4. I then created a hashlist by hashing nixlist and saved it to a file named hashlist.
5. Next I ran hashcat on hashlist using winlist as the wordlist. Before doing so I made sure hashcat.potfile was empty, and I also manually set the potfile to local.potfile.
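
For anyone who wants to reproduce steps 1 and 2 without vim, here is a minimal sketch. It assumes a POSIX shell with printf, wc, and od; the temporary directory and the .demo file names are placeholders of mine, not the files used in this post.

```shell
# Work in a throwaway directory so no real wordlist is overwritten.
demo_dir=$(mktemp -d)

# printf repeats its format for each argument, so every word gets the
# requested terminator: LF only for the nix list, CR LF for the win list.
printf '%s\n'   hello goodbye lucky snake > "$demo_dir/nixlist.demo"
printf '%s\r\n' hello goodbye lucky snake > "$demo_dir/winlist.demo"

# The byte counts differ by one CR per line (26 vs 30 bytes here).
wc -c < "$demo_dir/nixlist.demo"
wc -c < "$demo_dir/winlist.demo"

# Inspect the line endings byte by byte.
od -c "$demo_dir/winlist.demo"
```
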
 
I was surprised to see that the winlist wordlist successfully recovered the passwords in the nixlist-based hashlist. This suggests that hashcat is stripping \r.
Can someone confirm this or specify my error?  If it is documented somewhere, can you point me to that link?


Command output:
Quote:
cat winlist
hello
goodbye
lucky
snake
Quote:
hexdump -c winlist
0000000  h  e  l  l  o  \r  \n  g  o  o  d  b  y  e  \r  \n
0000010  l  u  c  k  y  \r  \n  s  n  a  k  e  \r  \n  \n
000001f
Quote:
for word in $(cat winlist); do echo -n "$word" | openssl sha1; done | awk '{print $2}'
e5ad4d3134d03e6bfc4de4f046c7c5d0b52962a5
ad928c1e055bbb0858c452b0d43b3740e53adc31
913e3490a7bf1ad10957f3073c8ea7e02f85bda0
9316338a5ff32b8172cb80d6b92dd6e8708ce46e
Quote:
cat nixlist
hello
goodbye
lucky
snake
Quote:
hexdump -c nixlist
0000000  h  e  l  l  o  \n  g  o  o  d  b  y  e  \n  l  u
0000010  c  k  y  \n  s  n  a  k  e  \n
000001a
Quote:
for word in $(cat nixlist); do echo -n "$word" | openssl sha1; done | awk '{print $2}'
aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
3c8ec4874488f6090a157b014ce3397ca8e06d4f
1ce1416347075b6070a35ce5e9d26b61d91ea6c3
148627088915c721ccebb4c611b859031037e6ad
Quote:
hashcat -m 100 -a 0 --potfile-path=local.potfile hashlist winlist
hashcat (v6.2.6) starting
<--truncated-->
Approaching final keyspace - workload adjusted.         
aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d:hello           
3c8ec4874488f6090a157b014ce3397ca8e06d4f:goodbye         
1ce1416347075b6070a35ce5e9d26b61d91ea6c3:lucky           
148627088915c721ccebb4c611b859031037e6ad:snake           
                                                         
Session..........: hashcat
Status...........: Cracked
Hash.Mode........: 100 (SHA1)
Hash.Target......: hashlist
Time.Started.....: Fri May 10 08:57:18 2024 (0 secs)
Time.Estimated...: Fri May 10 08:57:18 2024 (0 secs)
Kernel.Feature...: Pure Kernel
Guess.Base.......: File (winlist)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........:    4682 H/s (0.03ms) @ Accel:1024 Loops:1 Thr:64 Vec:1
Recovered........: 4/4 (100.00%) Digests (total), 4/4 (100.00%) Digests (new)
Progress.........: 5/5 (100.00%)
Rejected.........: 0/5 (0.00%)
Restore.Point....: 0/5 (0.00%)
Restore.Sub.#1...: Salt:0 Amplifier:0-1 Iteration:0-1
Candidate.Engine.: Device Generator
Candidates.#1....: hello ->
Hardware.Mon.#1..: Util: 89%
Started: Fri May 10 08:57:18 2024
Stopped: Fri May 10 08:57:19 2024
Quote:
cat local.potfile
aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d:hello
3c8ec4874488f6090a157b014ce3397ca8e06d4f:goodbye
1ce1416347075b6070a35ce5e9d26b61d91ea6c3:lucky
148627088915c721ccebb4c611b859031037e6ad:snake

Thank you for any help you can offer.
#2
Almost all programs accept both \n and \r\n as newlines. Hashcat isn't stripping anything as such; it reads each line and accepts \r\n as the line terminator. Usually, \n is the Linux convention and \r\n is the Windows one.
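
To illustrate the idea, here is a rough sketch of a line reader that accepts both terminators. It is not hashcat's actual code, only a shell imitation of the behavior described above; strip_cr is a hypothetical helper name, and awk and od are assumed to be available.

```shell
# Hypothetical helper: read lines from stdin and drop one trailing CR, so
# "\n" and "\r\n" both count as the end of a line.
strip_cr() {
  awk '{ sub(/\r$/, ""); print }'
}

# The same two words with Windows and Unix line endings.
# Both pipelines should print identical bytes.
printf 'hello\r\ngoodbye\r\n' | strip_cr | od -c
printf 'hello\ngoodbye\n'     | strip_cr | od -c
```

Only a CR directly before the newline is dropped, so a CR in the middle of a word stays part of the candidate.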
#3
Hi penguinkeeper, I do not think it is about acceptance. I am content that \r\n and \n do not cause issues in almost all programs. But they do cause issues in some programs. Tools like shasum, md5sum, and openssl include the \r in their input, which results in a different hash for what humans see as the same word. I was under the impression that hashcat did as well. My testing yesterday and today suggests otherwise.
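
As a small sketch of what I mean (assuming a POSIX shell with the default IFS; the inline word list is made up for the demo), a for loop over a CRLF list hands each word to the hashing tool with the \r still attached:

```shell
# Word splitting breaks on space, tab, and newline (the default IFS), but not
# on CR, so each $word taken from CRLF input still ends in "\r".
for word in $(printf 'hello\r\ngoodbye\r\n'); do
  # Count the bytes that would be piped into the hashing tool.
  printf '%s' "$word" | wc -c
done
```

The loop reports 6 and 8 bytes rather than 5 and 7, so shasum or openssl would be hashing "hello\r" and "goodbye\r".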

If \r is there and hashcat is not including it when hashing the word, then hashcat is "stripping" it (or some word that has the equivalent end result). How would you describe it if \r is part of the line but not considered by hashcat when hashing the word?
#4
If the \r is embedded inside a plain, as in a\rb, hashcat would find it like any other character, as long as it's not at the end of the line, since there it would be treated as part of a two-byte \r\n newline.


Code:
$ echo -en "a\rb" | md5sum
2132b3bda00d60e52785208164bff1c8  -
$ echo -en "a\rb" | ./hashcat.exe -m 0 2132b3bda00d60e52785208164bff1c8 --potfile-disable
2132b3bda00d60e52785208164bff1c8:$HEX[610d62]

"610d62" is the hexadecimal representation of a, \r, b.
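
If you want to check that mapping yourself, a quick sketch with printf and od (both assumed available; the hex variable is just a name for this demo) rebuilds the same string hashcat printed:

```shell
# Dump the three bytes of the candidate "a\rb" as hex and join the pairs.
hex=$(printf 'a\rb' | od -An -tx1 | tr -d ' \n')

# Print it in the same $HEX[...] form that hashcat used in the output above.
printf '$HEX[%s]\n' "$hex"
```
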
#5
I agree with what you are saying.  But in the command output below you can see that:
A. The two wordlist files have the same words but hash to different values because of the presence of \r\n (hex 0d 0a) vs \n (hex 0a) at the end of each word.
B. When I use the words in windowswords.txt in hashcat, they successfully recover the hashes that were made using the other wordlist (linuxwords.txt).
C. Using openssl or shasum, the words in windowswords.txt hash to different values than the ones hashcat successfully recovers. I am trying to understand why. Hashcat is doing something to or with the windowswords.txt wordlist that causes the words it contains to hash to different values than I get when using shasum or openssl to hash the list. The end result is that the passwords are recovered (yay!), but the behavior is not what I expected.

# linuxwords.txt and windowswords.txt look the same

Code:
❯ cat linuxwords.txt
hello
goodbye
lucky
snake

❯ cat windowswords.txt
hello
goodbye
lucky
snake

# hexdump and file commands show they are different (\r\n vs \n)

❯ hexdump -c linuxwords.txt
0000000  h  e  l  l  o  \n  g  o  o  d  b  y  e  \n  l  u
0000010  c  k  y  \n  s  n  a  k  e  \n  \n
000001b

❯ hexdump -c windowswords.txt
0000000  h  e  l  l  o  \r  \n  g  o  o  d  b  y  e  \r  \n
0000010  l  u  c  k  y  \r  \n  s  n  a  k  e  \r  \n
000001e

❯ hexdump -C linuxwords.txt
00000000  68 65 6c 6c 6f 0a 67 6f  6f 64 62 79 65 0a 6c 75  |hello.goodbye.lu|
00000010  63 6b 79 0a 73 6e 61 6b  65 0a 0a                |cky.snake..|
0000001b

❯ hexdump -C windowswords.txt
00000000  68 65 6c 6c 6f 0d 0a 67  6f 6f 64 62 79 65 0d 0a  |hello..goodbye..|
00000010  6c 75 63 6b 79 0d 0a 73  6e 61 6b 65 0d 0a        |lucky..snake..|
0000001e

❯ file windowswords.txt
windowswords.txt: ASCII text, with CRLF line terminators

❯ file linuxwords.txt
linuxwords.txt: ASCII text

# The words in linuxwords.txt hash to different values than the words in windowswords.txt.  I expected this.

❯ for word in $(cat linuxwords.txt); do echo -n $word | shasum; done | awk '{print $1}' | tee linuxhashes
aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
3c8ec4874488f6090a157b014ce3397ca8e06d4f
1ce1416347075b6070a35ce5e9d26b61d91ea6c3
148627088915c721ccebb4c611b859031037e6ad

❯ for word in $(cat windowswords.txt); do echo -n $word | shasum; done | awk '{print $1}'
e5ad4d3134d03e6bfc4de4f046c7c5d0b52962a5
ad928c1e055bbb0858c452b0d43b3740e53adc31
913e3490a7bf1ad10957f3073c8ea7e02f85bda0
9316338a5ff32b8172cb80d6b92dd6e8708ce46e

# hashcat recovers the hashes made from linuxwords.txt when using windowswords.txt as the dictionary.  The words in windowswords.txt and linuxwords.txt do not hash to the same values, but hashcat is still successful.  I did not expect this, and this is what I am trying to understand.

❯ hashcat -a 0 -m 100 linuxhashes windowswords.txt --potfile-disable
hashcat (v6.2.6) starting

<--truncated-->

aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d:hello
1ce1416347075b6070a35ce5e9d26b61d91ea6c3:lucky
148627088915c721ccebb4c611b859031037e6ad:snake
3c8ec4874488f6090a157b014ce3397ca8e06d4f:goodbye

<--truncated-->