Saturday, August 19, 2006

Awking Mysql

How do you perform awk type functions in Mysql?

Say you have a list of urls and you want to find the most common domain names hosting gif files.

http://www.fileden.com/123/foo.gif
http://h1.ripway.com/11.gif
http://www.hostmyfile.info/22/web.gif
http://www.fileden.com/24/fun.gif

If this list is in a file you can use standard linux commands to find the most common domains.

grep '\.mp3' my_pictures |awk -F'/' '{print $3}' |sort |uniq -c|sort -nr


If the list is in a mysql database you can use locate() and substring() to accomplish the same thing!

SELECT count(*) as count, substring(url,1,locate('/',url,9)) as f FROM `my_pictures` WHERE url IS NOT NULL and url !='' and filename like '%.mp3' group by f order by count desc

Happy Awking!

Tuesday, July 11, 2006

phpMyAdmin #1045 - Access denied for user 'root'@'localhost' (using password: NO)

When installing phpMyAdmin the most common error is not being able to connect to the database after installation.



#1045 - Access denied for user 'root'@'localhost' (using password: NO)

Here is the solution:
Make sure your config file in the phpMyAdmin directory is named "config.inc.php"!

In the installation instuctions it says to copy config.default.php into the phpMyAdmin directory and then customize it (usually the only change needed is to change the 'auth_type' is changed from 'config' to 'http'). Make sure to rename config.default.php to config.inc.php.

Tuesday, July 04, 2006

Get the Latidude and Longitude from an IP

ip2location shell script

for k in `cat ip_list`;
do
id=`echo $k|awk -F_ '{print $1}'`;
ip_num=`echo $k|awk -F_ '{print $2}'|awk -F. '{ printf "%10d\n",$1 * 16777216 + $2 * 65536 + $3 * 256 + $4 }'`;
echo id=$id;
echo ip_num=$ip_num;
echo "SELECT ip_latitude,ip_longitude,ip_city FROM ip2location WHERE (ip_from <= $ip_num) AND (ip_to >= $ip_num) LIMIT 1"|mysql -u doug crostel_store;
done


UPDATE my_space_users u,
ip2location i SET u.user_location = i.ip_city WHERE (

ip_from <=1091213045
) AND (
ip_to >=1091213045
) LIMIT 2

mysql> update my_space_users u, ip2location l set u.ip_country=l.country_short, u.ip_region=l.ip_region, u.ip_city=l.ip_city, u.ip_latitude=l.ip_latitude, u.ip_longitude=l.ip_longitude, u.ip_isp=l.ip_isp, u.ip_domain=l.ip_domain WHERE (l.ip_from <= u.reg_ip_num) AND (l.ip_to >= u.reg_ip_num);

Thursday, April 13, 2006

Using Wikipedia as a natural language corpus

1) download the english yahoo abstracts from

http://en.wikipedia.org/wiki/Sort-Merge_Join

wiki.txt
Anarchism is derived from the Greek αναÏÏία ("without archons (ruler, chief, king)"). Anarchism as a political philosophy, is the belief that rulers, governments, and hierarchal social relationships are unnecessary and should be abolished, although there are differing interpretations of what this means.
|
Albedo is the measure of reflectivity of a surface or body. It is the ratio of electromagnetic radiation (EM radiation) reflected to the amount incident upon it.
Abu Dhabi (Arabic: أب٠ظب٠ʼAbÅ« áºaby) is the largest of the seven emirates that comprise the United Arab Emirates and was also the largest of the former Trucial States. Abu Dhabi is also a city of the same name within the Emirate that is the capital of the country, in north central UAE.


wiki.txt1
Anarchism
is
derived
from
the

wiki.txt2
is
derived
from
the
Greek

wiki.txt12
Anarchism is
is derived
derived from
from the
the Greek

wiki.txt.sorted
wiki.txt.sorted.1
wiki.txt.sorted.nonums
wiki.txt.sorted.nonums1
wiki.txt.sorted.nonums1.uniq
wiki.txt.sorted.nonums1.uniq.2words

export LC_ALL=C

cat
wordnet.index
wordnet.index.sorted
wordnet.index.sorted.uniq
wiki.txt.sorted.nonums1.uniq.2words.cols
wiki.txt.sorted.nonums1.uniq.2words.cols.LC_ALL
wordnet.index.sorted.uniq.LC_ALL
wiki-wordnet.join


for k in `cat wordnet.index.sorted.uniq.LC_ALL`; do look $k wiki.txt.sorted.nonums1.uniq.2words.cols.LC_ALL; done >/tar-backups/wiki-wordnet.join



3 word n-grams
grep -v '[^[:print:]]' wiki.txt1 >wiki1.txt
tail +2 wiki1.txt >wiki2.txt
tail +2 wiki2.txt >wiki3.txt

grep "[[:print:]]\+\t[[:print:]]\+\t[[:print:]]" wiki123.txt >wiki123.txt3



find the most used 3 word phrases in a text file
sort rewrite123.txt3 >rewrite123.txt3
grep ".\+\t.\+\t.\+" rewrite123.txt >rewrite123.txt3
sort rewrite123.txt3 >rewrite123.txt3.sorted
uniq -dc rewrite123.txt3.sorted >rewrite123.txt3.sorted.uniqd
sort -nr rewrite123.txt3.sorted.uniqd >rewrite123.txt3.sorted.uniqd.sorted
more rewrite123.txt3.sorted.uniqd.sorted

3 be traveling with
2 you want to
2 want to have
2 student group bookings.
2 part of their
2 on the other
2 notarized letter of
2 made with tap
2 entertainment , activities
2 and want to

Saturday, March 18, 2006

Plot a wav file - wav2png

#!/bin/sh
#usage: wav2png.sh file.wav


BASE=${1%.wav}
PNG=$BASE.png
WAV=$BASE.wav
DAT=$BASE.dat

#echo $BASE,$PNG,$WAV,$DAT
sox $WAV $DAT
grep -v '^;' $DAT >$DAT.clean
FREQ=`head -1 $DAT|tr -d ';'`

echo -e "set terminal png;set title '$FREQ';set output '$PNG'; plot '$DAT.clean'" |gnuplot

Split a midi file into separate tracks

Get the program:

wget http://interglacial.com/~sburke/pub/midi_splitter.pl


perl midi_splitter.pl /tmp/vanilla.mid
file vanilla*.mid
midi2abc vanilla.mid
midi2abc vanilla_t01_c00.mid >vanilla_t01.abc

Thursday, March 16, 2006

Access internet hotspots in client mode with WRT54G

Linksys WRT54G (below version 5 because linksys later switched to vmware)

Load the sveasoft firmware. (I am using Alchemy-pre5.3 v2.04.4.8sv)

Put it in client mode with loopback off.

Here is a good tutorial:
http://www.engadget.com/2005/05/24/how-to-connect-your-linksys-wrt54g-network-to-the-internet/

ssh into the router and type:
~ # wl scan
~ # wl scanresults|tr -d '\n' |sed 's/noise/\n/g' |tr ']' '\n'|grep ^SSID|awk '{print $(NF-1),$2}' |sed 's/Mode://'|sort -n
-96 "zeus"
-94 "LF-X1U.00014A10A7C8"
-94 "wireless"
-89 "Doug"
-88 "GHOST"
-86 "COR-MDA-LAN"
-86 "linksys"
-86 "linksys"
-83 "default"
-83 "hpsetup"
-83 "linksys"
-78 "hpsetup"
-78 "linksys"
-74 "hirsh
-51 "link"

You will get a sorted list of the most powerfull access points in range.