How do you perform awk type functions in Mysql?
Say you have a list of urls and you want to find the most common domain names hosting gif files.
http://www.fileden.com/123/foo.gif
http://h1.ripway.com/11.gif
http://www.hostmyfile.info/22/web.gif
http://www.fileden.com/24/fun.gif
If this list is in a file you can use standard linux commands to find the most common domains.
grep '\.mp3' my_pictures |awk -F'/' '{print $3}' |sort |uniq -c|sort -nr
If the list is in a mysql database you can use locate() and substring() to accomplish the same thing!
SELECT count(*) as count, substring(url,1,locate('/',url,9)) as f FROM `my_pictures` WHERE url IS NOT NULL and url !='' and filename like '%.mp3' group by f order by count desc
Happy Awking!
Saturday, August 19, 2006
Tuesday, July 11, 2006
phpMyAdmin #1045 - Access denied for user 'root'@'localhost' (using password: NO)
When installing phpMyAdmin the most common error is not being able to connect to the database after installation.
#1045 - Access denied for user 'root'@'localhost' (using password: NO)
Here is the solution:
Make sure your config file in the phpMyAdmin directory is named "config.inc.php"!
In the installation instuctions it says to copy config.default.php into the phpMyAdmin directory and then customize it (usually the only change needed is to change the 'auth_type' is changed from 'config' to 'http'). Make sure to rename config.default.php to config.inc.php.
Tuesday, July 04, 2006
Get the Latidude and Longitude from an IP
ip2location shell script
for k in `cat ip_list`;
do
id=`echo $k|awk -F_ '{print $1}'`;
ip_num=`echo $k|awk -F_ '{print $2}'|awk -F. '{ printf "%10d\n",$1 * 16777216 + $2 * 65536 + $3 * 256 + $4 }'`;
echo id=$id;
echo ip_num=$ip_num;
echo "SELECT ip_latitude,ip_longitude,ip_city FROM ip2location WHERE (ip_from <= $ip_num) AND (ip_to >= $ip_num) LIMIT 1"|mysql -u doug crostel_store;
done
UPDATE my_space_users u,
ip2location i SET u.user_location = i.ip_city WHERE (
ip_from <=1091213045
) AND (
ip_to >=1091213045
) LIMIT 2
mysql> update my_space_users u, ip2location l set u.ip_country=l.country_short, u.ip_region=l.ip_region, u.ip_city=l.ip_city, u.ip_latitude=l.ip_latitude, u.ip_longitude=l.ip_longitude, u.ip_isp=l.ip_isp, u.ip_domain=l.ip_domain WHERE (l.ip_from <= u.reg_ip_num) AND (l.ip_to >= u.reg_ip_num);
for k in `cat ip_list`;
do
id=`echo $k|awk -F_ '{print $1}'`;
ip_num=`echo $k|awk -F_ '{print $2}'|awk -F. '{ printf "%10d\n",$1 * 16777216 + $2 * 65536 + $3 * 256 + $4 }'`;
echo id=$id;
echo ip_num=$ip_num;
echo "SELECT ip_latitude,ip_longitude,ip_city FROM ip2location WHERE (ip_from <= $ip_num) AND (ip_to >= $ip_num) LIMIT 1"|mysql -u doug crostel_store;
done
UPDATE my_space_users u,
ip2location i SET u.user_location = i.ip_city WHERE (
ip_from <=1091213045
) AND (
ip_to >=1091213045
) LIMIT 2
mysql> update my_space_users u, ip2location l set u.ip_country=l.country_short, u.ip_region=l.ip_region, u.ip_city=l.ip_city, u.ip_latitude=l.ip_latitude, u.ip_longitude=l.ip_longitude, u.ip_isp=l.ip_isp, u.ip_domain=l.ip_domain WHERE (l.ip_from <= u.reg_ip_num) AND (l.ip_to >= u.reg_ip_num);
Thursday, April 13, 2006
Using Wikipedia as a natural language corpus
1) download the english yahoo abstracts from
wiki.txt
Anarchism is derived from the Greek αναÏÏία ("without archons (ruler, chief, king)"). Anarchism as a political philosophy, is the belief that rulers, governments, and hierarchal social relationships are unnecessary and should be abolished, although there are differing interpretations of what this means.
|
Albedo is the measure of reflectivity of a surface or body. It is the ratio of electromagnetic radiation (EM radiation) reflected to the amount incident upon it.
Abu Dhabi (Arabic: أب٠ظب٠ʼAbÅ« áºaby) is the largest of the seven emirates that comprise the United Arab Emirates and was also the largest of the former Trucial States. Abu Dhabi is also a city of the same name within the Emirate that is the capital of the country, in north central UAE.
wiki.txt1
Anarchism
is
derived
from
the
wiki.txt2
is
derived
from
the
Greek
wiki.txt12
Anarchism is
is derived
derived from
from the
the Greek
wiki.txt.sorted
wiki.txt.sorted.1
wiki.txt.sorted.nonums
wiki.txt.sorted.nonums1
wiki.txt.sorted.nonums1.uniq
wiki.txt.sorted.nonums1.uniq.2words
export LC_ALL=C
cat
wordnet.index
wordnet.index.sorted
wordnet.index.sorted.uniq
wiki.txt.sorted.nonums1.uniq.2words.cols
wiki.txt.sorted.nonums1.uniq.2words.cols.LC_ALL
wordnet.index.sorted.uniq.LC_ALL
wiki-wordnet.join
for k in `cat wordnet.index.sorted.uniq.LC_ALL`; do look $k wiki.txt.sorted.nonums1.uniq.2words.cols.LC_ALL; done >/tar-backups/wiki-wordnet.join
3 word n-grams
grep -v '[^[:print:]]' wiki.txt1 >wiki1.txt
tail +2 wiki1.txt >wiki2.txt
tail +2 wiki2.txt >wiki3.txt
grep "[[:print:]]\+\t[[:print:]]\+\t[[:print:]]" wiki123.txt >wiki123.txt3
find the most used 3 word phrases in a text file
sort rewrite123.txt3 >rewrite123.txt3
grep ".\+\t.\+\t.\+" rewrite123.txt >rewrite123.txt3
sort rewrite123.txt3 >rewrite123.txt3.sorted
uniq -dc rewrite123.txt3.sorted >rewrite123.txt3.sorted.uniqd
sort -nr rewrite123.txt3.sorted.uniqd >rewrite123.txt3.sorted.uniqd.sorted
more rewrite123.txt3.sorted.uniqd.sorted
3 be traveling with
2 you want to
2 want to have
2 student group bookings.
2 part of their
2 on the other
2 notarized letter of
2 made with tap
2 entertainment , activities
2 and want to
http://en.wikipedia.org/wiki/Sort-Merge_Join
wiki.txt
Anarchism is derived from the Greek αναÏÏία ("without archons (ruler, chief, king)"). Anarchism as a political philosophy, is the belief that rulers, governments, and hierarchal social relationships are unnecessary and should be abolished, although there are differing interpretations of what this means.
|
Albedo is the measure of reflectivity of a surface or body. It is the ratio of electromagnetic radiation (EM radiation) reflected to the amount incident upon it.
Abu Dhabi (Arabic: أب٠ظب٠ʼAbÅ« áºaby) is the largest of the seven emirates that comprise the United Arab Emirates and was also the largest of the former Trucial States. Abu Dhabi is also a city of the same name within the Emirate that is the capital of the country, in north central UAE.
wiki.txt1
Anarchism
is
derived
from
the
wiki.txt2
is
derived
from
the
Greek
wiki.txt12
Anarchism is
is derived
derived from
from the
the Greek
wiki.txt.sorted
wiki.txt.sorted.1
wiki.txt.sorted.nonums
wiki.txt.sorted.nonums1
wiki.txt.sorted.nonums1.uniq
wiki.txt.sorted.nonums1.uniq.2words
export LC_ALL=C
cat
wordnet.index
wordnet.index.sorted
wordnet.index.sorted.uniq
wiki.txt.sorted.nonums1.uniq.2words.cols
wiki.txt.sorted.nonums1.uniq.2words.cols.LC_ALL
wordnet.index.sorted.uniq.LC_ALL
wiki-wordnet.join
for k in `cat wordnet.index.sorted.uniq.LC_ALL`; do look $k wiki.txt.sorted.nonums1.uniq.2words.cols.LC_ALL; done >/tar-backups/wiki-wordnet.join
3 word n-grams
grep -v '[^[:print:]]' wiki.txt1 >wiki1.txt
tail +2 wiki1.txt >wiki2.txt
tail +2 wiki2.txt >wiki3.txt
grep "[[:print:]]\+\t[[:print:]]\+\t[[:print:]]" wiki123.txt >wiki123.txt3
find the most used 3 word phrases in a text file
sort rewrite123.txt3 >rewrite123.txt3
grep ".\+\t.\+\t.\+" rewrite123.txt >rewrite123.txt3
sort rewrite123.txt3 >rewrite123.txt3.sorted
uniq -dc rewrite123.txt3.sorted >rewrite123.txt3.sorted.uniqd
sort -nr rewrite123.txt3.sorted.uniqd >rewrite123.txt3.sorted.uniqd.sorted
more rewrite123.txt3.sorted.uniqd.sorted
3 be traveling with
2 you want to
2 want to have
2 student group bookings.
2 part of their
2 on the other
2 notarized letter of
2 made with tap
2 entertainment , activities
2 and want to
Saturday, March 18, 2006
Plot a wav file - wav2png
#!/bin/sh
#usage: wav2png.sh file.wav
BASE=${1%.wav}
PNG=$BASE.png
WAV=$BASE.wav
DAT=$BASE.dat
#echo $BASE,$PNG,$WAV,$DAT
sox $WAV $DAT
grep -v '^;' $DAT >$DAT.clean
FREQ=`head -1 $DAT|tr -d ';'`
echo -e "set terminal png;set title '$FREQ';set output '$PNG'; plot '$DAT.clean'" |gnuplot
#usage: wav2png.sh file.wav
BASE=${1%.wav}
PNG=$BASE.png
WAV=$BASE.wav
DAT=$BASE.dat
#echo $BASE,$PNG,$WAV,$DAT
sox $WAV $DAT
grep -v '^;' $DAT >$DAT.clean
FREQ=`head -1 $DAT|tr -d ';'`
echo -e "set terminal png;set title '$FREQ';set output '$PNG'; plot '$DAT.clean'" |gnuplot
Split a midi file into separate tracks
Get the program:
wget http://interglacial.com/~sburke/pub/midi_splitter.pl
perl midi_splitter.pl /tmp/vanilla.mid
file vanilla*.mid
midi2abc vanilla.mid
midi2abc vanilla_t01_c00.mid >vanilla_t01.abc
wget http://interglacial.com/~sburke/pub/midi_splitter.pl
perl midi_splitter.pl /tmp/vanilla.mid
file vanilla*.mid
midi2abc vanilla.mid
midi2abc vanilla_t01_c00.mid >vanilla_t01.abc
Thursday, March 16, 2006
Access internet hotspots in client mode with WRT54G
Linksys WRT54G (below version 5 because linksys later switched to vmware)
Load the sveasoft firmware. (I am using Alchemy-pre5.3 v2.04.4.8sv)
Put it in client mode with loopback off.
Here is a good tutorial:
http://www.engadget.com/2005/05/24/how-to-connect-your-linksys-wrt54g-network-to-the-internet/
ssh into the router and type:
~ # wl scan
~ # wl scanresults|tr -d '\n' |sed 's/noise/\n/g' |tr ']' '\n'|grep ^SSID|awk '{print $(NF-1),$2}' |sed 's/Mode://'|sort -n
-96 "zeus"
-94 "LF-X1U.00014A10A7C8"
-94 "wireless"
-89 "Doug"
-88 "GHOST"
-86 "COR-MDA-LAN"
-86 "linksys"
-86 "linksys"
-83 "default"
-83 "hpsetup"
-83 "linksys"
-78 "hpsetup"
-78 "linksys"
-74 "hirsh
-51 "link"
You will get a sorted list of the most powerfull access points in range.
Load the sveasoft firmware. (I am using Alchemy-pre5.3 v2.04.4.8sv)
Put it in client mode with loopback off.
Here is a good tutorial:
http://www.engadget.com/2005/05/24/how-to-connect-your-linksys-wrt54g-network-to-the-internet/
ssh into the router and type:
~ # wl scan
~ # wl scanresults|tr -d '\n' |sed 's/noise/\n/g' |tr ']' '\n'|grep ^SSID|awk '{print $(NF-1),$2}' |sed 's/Mode://'|sort -n
-96 "zeus"
-94 "LF-X1U.00014A10A7C8"
-94 "wireless"
-89 "Doug"
-88 "GHOST"
-86 "COR-MDA-LAN"
-86 "linksys"
-86 "linksys"
-83 "default"
-83 "hpsetup"
-83 "linksys"
-78 "hpsetup"
-78 "linksys"
-74 "hirsh
-51 "link"
You will get a sorted list of the most powerfull access points in range.
Tuesday, February 28, 2006
Split a polyphonic midi file into monophonic tracks
perl midi_splitter.pl /tmp/vanilla.mid
Subscribe to:
Posts (Atom)