| Janis Papanagnou 2007-05-25, 7:17 pm |
| Clement wrote:
> Hi,
> I have a file which looks as follows:
> [**] [122:17:0] (portscan) UDP Portscan [**]
> 03/05-22:04:55.963641 85.181.42.174 -> 192.168.123.179
> PROTO255 TTL:0 TOS:0xC0 ID:21124 IpLen:20 DgmLen:171
>
> [**] [1:1384:9] MISC UPnP malformed advertisement [**]
> [Classification: Misc Attack] [Priority: 2]
> 03/05-22:05:09.030411 192.168.123.254:4727 -> 239.255.255.250:1900
> UDP TTL:64 TOS:0x0 ID:24964 IpLen:20 DgmLen:298
> Len: 270
> [Xref => http://www.microsoft.com/technet/se...n/MS01-059.mspx][Xref
> => http://cgi.nessus.org/plugins/dump.php3?id=10829][Xref =>
> http://cve.mitre.org/cgi-bin/cvename.cgi?name=2001-0877][Xref =>
> http://cve.mitre.org/cgi-bin/cvename.cgi?name=2001-0876][Xref =>
> http://www.securityfocus.com/bid/3723]
>
> I would like to anonymize it by replacing the IP addresses with random
> addresses, but the replacement has to be consistent. For example, if the
> script changes 192.168.1.0 to 123.124.125.126 and there is another
> 192.168.1.0, it has to be changed to 123.124.125.126 again.
The key to this is to store the new random IP address in a map table
indexed by the original IP address. If a matched address is already in
the table, use the stored value to replace the original one; otherwise
generate one randomly and store it.
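A minimal sketch of the map-table idea, written as an awk program inside a shell command so it can be pasted into a terminal. The three sample addresses, the array name `map` and the helper `r256` are assumptions chosen for illustration; the only point is that a repeated address gets the replacement that was stored the first time.

```sh
# Feed three addresses through awk; the first and third are identical.
out=$(printf '%s\n' 192.168.1.0 10.0.0.1 192.168.1.0 | awk '
function r256() { return int(rand() * 256) }   # random number in [0..255]
BEGIN { srand() }                              # seed from the current time
{
    if (!($0 in map))                          # address not seen before
        map[$0] = r256() "." r256() "." r256() "." r256()
    print $0 " -> " map[$0]                    # repeats reuse the stored value
}')
printf '%s\n' "$out"
```

The random values differ from run to run, but within one run lines 1 and 3 of the output always show the same replacement.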
> And I would like to replace the web addresses with ***
> example: http://www.google.com becomes http://**************
That is a simple substitution of the characters in the substring that
matches an http address. (BTW, is "https://" not possible?)
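A small sketch of that substitution, again as awk inside a shell command. It assumes that an address ends at the next space and it handles only the first address on a line; the sample sentence and the variable names `url`, `scheme` and `rest` are made up for the illustration. The optional "s" in the pattern covers "https://" as well.

```sh
out=$(echo 'see http://www.google.com for details' | awk '
{
    if (match($0, /https?:\/\/[^ ]+/)) {        # also accepts "https://"
        url    = substr($0, RSTART, RLENGTH)
        scheme = url
        sub(/:\/\/.*/, "://", scheme)           # leaves "http://" or "https://"
        rest   = substr(url, length(scheme) + 1)
        gsub(/./, "*", rest)                    # one star per hidden character
        $0 = substr($0, 1, RSTART - 1) scheme rest substr($0, RSTART + RLENGTH)
    }
    print
}')
echo "$out"
# prints: see http://************** for details
```

Keeping one star per character preserves the length of the line, which matches the example given in the question.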
>
> It would be really great if somebody could help me.
What help exactly do you need? A suggestion for how to solve it?
Which tools to use? A complete, worked-out and tested program?
Have you tried anything yourself?
I would suggest writing an awk script that does the following:
    for each record
        for each http address in the record
            substitute http address characters by stars
        for each IP address in the record
            if the IP address is in a map table
                new IP address is content of map table
            else
                generate new random IP address
                add random IP address to map table
            substitute new IP address
        print the unmodified or modified record
If you need some concrete help, tell us.
Janis
> Thank you very much
>
| |
| Clement 2007-05-26, 1:24 pm |
| On 25 May, 21:17, Janis Papanagnou <Janis_Papanag...@hotmail.com>
wrote:
> Clement wrote:
>
>
>
> The key to this is to store the new random IP address in a map table
> indexed by the original IP address. If a matched address is already in
> the table, use the stored value to replace the original one; otherwise
> generate one randomly and store it.
>
>
> That is a simple substitution of the characters in the substring that
> matches an http address. (BTW, is "https://" not possible?)
>
>
>
>
> What help exactly do you need? A suggestion for how to solve it?
> Which tools to use? A complete, worked-out and tested program?
> Have you tried anything yourself?
>
> I would suggest writing an awk script that does the following:
>
>     for each record
>         for each http address in the record
>             substitute http address characters by stars
>         for each IP address in the record
>             if the IP address is in a map table
>                 new IP address is content of map table
>             else
>                 generate new random IP address
>                 add random IP address to map table
>             substitute new IP address
>         print the unmodified or modified record
>
> If you need some concrete help, tell us.
>
> Janis
>
Well, actually I don't know anything about shell scripting, so I'm afraid
it's gonna take me ages to solve it. If it seems that easy to someone,
then I would be happy to have a complete, worked-out and tested
program, but I don't know if I can ask for that...
Thanks
| |
| Janis Papanagnou 2007-05-26, 1:24 pm |
| Clement wrote:
> On 25 May, 21:17, Janis Papanagnou <Janis_Papanag...@hotmail.com>
> wrote:
>
>
>
> Well, actually I don't know anything about shell scripting, so I'm afraid
> it's gonna take me ages to solve it. If it seems that easy to someone,
> then I would be happy to have a complete, worked-out and tested
> program, but I don't know if I can ask for that...
Okay, it's not an uninteresting task, and it might even be generally
helpful to contribute something that supports privacy.
The following code implements the algorithm outlined above. It makes
a few simplifications: e.g. the http addresses are assumed to be
followed by a closing bracket ']' (as in your data); if that's not
appropriate, the definition of the HTTP regular expression below has
to be adjusted[*]. The code is a quick hack and could likely be
improved, but I am too lazy at the moment (after all, it's the weekend ;-)
To run the code below, put it in a file called anon.awk and call it
as
    awk -f anon.awk your_log_files...
I've briefly tested the code, but if you find any problems, please let
me know. If you have further questions, feel free to ask.
Janis
[*] Off the top of my head I don't know which characters are allowed
in URLs (I suppose [-_A-Za-z0-9:/%\.?=], what else?); I'll have to
look it up sometime to fix the HTTP definition.
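For reference, RFC 3986 allows letters, digits, the unreserved marks -._~, the reserved characters :/?#[]@!$&'()*+,;= and % for escapes. A possible character-class pattern built from that set is sketched below. It is an assumption, not the definition used in the script: "[" and "]" are left out on purpose because they delimit the Xref entries in this log, and the single quote is left out only because it is awkward to quote in a shell command.

```sh
out=$(echo '[Xref => http://cve.mitre.org/cgi-bin/cvename.cgi?name=2001-0877][Xref => http://www.securityfocus.com/bid/3723]' | awk '
BEGIN {
    # URL characters per RFC 3986, minus "[", "]" and the single quote
    HTTP = "https?://[-A-Za-z0-9._~:/?#@!$&()*+,;=%]+"
}
{
    line = $0
    while (match(line, HTTP)) {
        print substr(line, RSTART, RLENGTH)     # show what the pattern captured
        line = substr(line, RSTART + RLENGTH)   # continue after this match
    }
}')
echo "$out"
# prints:
# http://cve.mitre.org/cgi-bin/cvename.cgi?name=2001-0877
# http://www.securityfocus.com/bid/3723
```

With such a pattern the match ends at the first character that cannot be part of the address, so the closing "]" is no longer needed as a terminator.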
---<snip>---
#!/bin/awk -f
# anon.awk - Anonymize IP addresses and http addresses.
#
# The code is released under the GNU General Public Licence.
# Janis Papanagnou, May 2007

BEGIN {
    srand ()                        # seed the random generator from the clock
    IP_part = "[0-2]?[0-9]?[0-9]"   # a simplified pattern
    RE_dot  = "\\."
    IP      = IP_part RE_dot IP_part RE_dot IP_part RE_dot IP_part
    # A simplified pattern (expecting a final "]").  "[^]]*" stops at the
    # first "]", so several addresses on one line are matched one by one.
    HTTP    = "http://[^]]*\\]"
    # TODO: fix HTTP definition
}

# Replace the characters of each http address by stars.
$0 ~ HTTP {
    line = $0 ; out = ""
    while (match (line, HTTP))
    {
        head = substr (line, 1, RSTART-1)
        addr = substr (line, RSTART, RLENGTH)
        tail = substr (line, RSTART+RLENGTH)
        gsub (/./, "*", addr)
        # keep the leading "http://" and the final "]"; star the rest
        out = out head "http://" substr (addr, 9) "]"
        line = tail
    }
    $0 = out line
}

# Replace each IP address by a random one, reusing earlier replacements.
$0 ~ IP {
    line = $0 ; out = ""
    while (match (line, IP))
    {
        head = substr (line, 1, RSTART-1)
        addr = substr (line, RSTART, RLENGTH)
        tail = substr (line, RSTART+RLENGTH)
        # we don't care about valid sets of IP addresses and use
        # four random numbers [0..255]
        if (! (addr in map))
            map[addr] = r256() "." r256() "." r256() "." r256()
        out = out head map[addr]
        line = tail
    }
    $0 = out line
}

{ print }

function r256 ()
{
    return int(rand()*256)
}
---<snip>---
> Thanks
>