Standard
SQL |
Legacy
SQL |
#standardSQL SELECT repository.url FROM `bigquery-public-data.samples.github_nested` LIMIT 5; |
#legacySQL SELECT repository.url FROM [bigquery-public-data.samples.github_nested] LIMIT 5; |
Numeric |
|
SAFE_CAST(x
AS INT64) |
INTEGER(x) |
SAFE_CAST(x
AS INT64) |
CAST(x
AS INTEGER) |
APPROX_COUNT_DISTINCT(x) |
COUNT(DISTINCT
x) |
COUNT(DISTINCT
x) |
EXACT_COUNT_DISTINCT(x) |
APPROX_QUANTILES(x,
buckets) |
QUANTILES(x, buckets
+ 1) |
APPROX_TOP_COUNT(x,
num) |
TOP(x, num), COUNT(*) |
MOD(x,
y) |
x
% y |
Datetime |
|
TIMESTAMP_DIFF(t1,
t2, DAY) |
DATEDIFF(t1, t2) |
CURRENT_TIMESTAMP |
NOW |
FORMAT_TIMESTAMP(fmt,
t) |
STRFTIME_UTC_USEC(t, fmt) |
TIMESTAMP_TRUNC(t,
DAY) |
UTC_USEC_TO_DAY(t) |
REGEXP_CONTAINS(s,
pattern) |
REGEXP_MATCH(s, pattern) |
x
IS NULL |
IS_NULL(x) |
Strings |
|
SAFE_CAST(x
AS STRING) |
STRING(x) |
SAFE_CAST(x
AS STRING) |
CAST(x
AS STRING) |
SUBSTR(s,
0, len) |
LEFT(s, len) |
SUBSTR(s,
-len) |
RIGHT(s, len) |
STRPOS(s,
"abc") > 0 or s LIKE '%abc%' |
s
CONTAINS "abc" |
STRING_AGG(s,
sep) |
GROUP_CONCAT_UNQUOTED(s, sep) |
IFNULL(LOGICAL_OR(x), false) |
SOME(x) |
IFNULL(LOGICAL_AND(x), true) |
EVERY(x) |
Arrays |
|
ARRAY_AGG(x) |
NEST(x) |
ANY_VALUE(x) |
ANY(x) |
arr[SAFE_ORDINAL(index)] |
NTH(index, arr)
WITHIN RECORD |
ARRAY_LENGTH(arr) |
COUNT(arr)
WITHIN RECORD |
Url / IP Address Functions |
|
NET.HOST(url) |
HOST(url) |
NET.PUBLIC_SUFFIX(url) |
TLD(url) |
NET.REG_DOMAIN(url) |
DOMAIN(url) |
NET.IPV4_TO_INT64( NET.IP_FROM_STRING( addr_string)) |
PARSE_IP(addr_string) |
NET.IP_TO_STRING( NET.IPV4_FROM_INT64( addr_int64
& 0xFFFFFFFF)) |
FORMAT_IP(addr_int64) |
NET.IP_FROM_STRING(addr_string) |
PARSE_PACKED_IP(addr_string) |
NET.IP_TO_STRING(addr_bytes) |
FORMAT_PACKED_IP(addr_bytes) |
Tips and good resources for all.. Oracle, Big data, Hadoop, Unix, Linux, Cloud, etc.
Sunday, May 30, 2021
Standard SQL vs Legacy SQL - functions
Wednesday, May 5, 2021
Difference between Cloud Storage and HDFS
Feature | Cloud Storage | HDFS |
I/O variance | Higher variance We can avoid it by using caching and read replica, considering cloud big table and cloud datastore. | Lower variance |
Support of file appends or truncates | No can do overwrite of files | Yes |
POSIX - compliant | not compliant | not fully compliant |
file system information | no information available | "hadoop fsck --files --blocks" - exposes all directory information's in HDFS. |
request latency | greater round trip latency | lesser latency |
cost | 57% lower total cost of ownership | Much higher compared to Cloud |
separation between cloud storage and HDFS | Multiple clusters can access same location usage of gs:// in cloud workloads much better performance on Persistent Disk | Single cluster access access using hdfs:// with in cluster lesser performance due to low grade HDD |
interoperability | seamless connectivity between spark and hadoop instances. exported files can import in big query. easier to manage cloud dataflow tasks | standalone cluster instances operations manual access provision between instances and storage |
storage management | Lesser overhead on operations team | Much overhead on operations team |
startup time | quicker job startup time during job execution | it takes significant time to start up jobs |
security | cloud inbuilt security and access using IAM | manual security and access setup. changing security key is an overhead to operations teams |
Saturday, August 24, 2019
create ssh keys and copy it on the remote server
Step 1: Create public and private keys using ssh-key-gen on local-host
mkdir -p $HOME/.ssh
chmod 0700 $HOME/.ssh
ssh-keygen -t rsa -b 4096 <> -f ~/.ssh/newremoteweb.key <> -C "My new remote web key" <>
ssh-keygen -t rsa -P ""
Step 2: Copy the public key to remote-host using ssh-copy-id
ssh-copy-id -i ~/.ssh/id_rsa.pub remote-host
ssh-copy-id -i ~/.ssh/id_rsa.pub remote-user@remote-host
scp $HOME/.ssh/id_rsa.pub remote-user@remote-host:~/.ssh/authorized_keys
adding the public key in remote server
## First create .ssh directory on server
ssh remote-user@remote-host "umask 077; test -d .ssh || mkdir .ssh"
## cat local id.rsa.pub file and pipe over ssh to append the public key in remote servercat $HOME/.ssh/id_rsa.pub | ssh remote-user@remote-host "cat >> .ssh/authorized_keys"
Now try logging into the machine, with "ssh 'remote-host'", and check in:
.ssh/authorized_keys
Step 3: Login to remote-host without entering the password
mkdir -p $HOME/.ssh
chmod 0700 $HOME/.ssh
ssh-keygen -t rsa -b 4096 <
ssh-keygen -t rsa -P ""
Step 2: Copy the public key to remote-host using ssh-copy-id
ssh-copy-id -i ~/.ssh/id_rsa.pub remote-host
ssh-copy-id -i ~/.ssh/id_rsa.pub remote-user@remote-host
scp $HOME/.ssh/id_rsa.pub remote-user@remote-host:~/.ssh/authorized_keys
adding the public key in remote server
## First create .ssh directory on server
ssh remote-user@remote-host "umask 077; test -d .ssh || mkdir .ssh"
## cat local id.rsa.pub file and pipe over ssh to append the public key in remote servercat $HOME/.ssh/id_rsa.pub | ssh remote-user@remote-host "cat >> .ssh/authorized_keys"
Now try logging into the machine, with "ssh 'remote-host'", and check in:
.ssh/authorized_keys
Step 3: Login to remote-host without entering the password
ssh remote-host
It should connect without password. Our testing is over..
Some important stuffs which you can consider.
$HOME/.ssh/id_rsa– contains your private key.
$HOME/.ssh/id_rsa.pub – contain your public key.
restarting ssh daemon/service on the server.
## on centos/rhel/fedora
sudo service sshd reload
## on linux server
sudo systemctl reload sshd
## on debian/ubuntu - older version
sudo /etc/init.d/ssh reload
## on debian/ubuntu - latest
sudo systemctl reload ssh
## common method to reload sshd
sudo kill -HUP `cat /var/run/sshd.pid`
or
sudo kill -HUP $(cat /var/run/sshd.pid)
## list all public key parameters of all identities from the ssh-agent:
ssh-add -L
## deleting all private keys from the ssh-agent:
ssh-add -D
## kill the ssh agent, which is running:
kill $SSH_AGENT_PID
trap "kill $SSH_AGENT_PID" 0
It should connect without password. Our testing is over..
Some important stuffs which you can consider.
$HOME/.ssh/id_rsa– contains your private key.
$HOME/.ssh/id_rsa.pub – contain your public key.
restarting ssh daemon/service on the server.
## on centos/rhel/fedora
sudo service sshd reload
## on linux server
sudo systemctl reload sshd
## on debian/ubuntu - older version
sudo /etc/init.d/ssh reload
## on debian/ubuntu - latest
sudo systemctl reload ssh
## common method to reload sshd
sudo kill -HUP `cat /var/run/sshd.pid`
or
sudo kill -HUP $(cat /var/run/sshd.pid)
## list all public key parameters of all identities from the ssh-agent:
ssh-add -L
## deleting all private keys from the ssh-agent:
ssh-add -D
## kill the ssh agent, which is running:
kill $SSH_AGENT_PID
trap "kill $SSH_AGENT_PID" 0
Subscribe to:
Posts (Atom)