Dmitri Apassov

@MadDataSc

Followers
57
Following
8
Media
1
Statuses
290

math pro, BI specialist, aspiring data miner

Stockholm
Joined March 2012
@MadDataSc
Dmitri Apassov
5 years
#pentaho #pdi question
0
0
0
@MadDataSc
Dmitri Apassov
5 years
#PDI #kettle #pentaho [rows with date] --> [job executor] does "execute job for each date".
0
0
0
@MadDataSc
Dmitri Apassov
5 years
#PDI #kettle #pentaho If (like me) you have no JavaScript skills, use the built-in #h2 database to generate rowsets with familiar constants like "today". H2 connection: localhost/mem:db, port 8082. Statement: SELECT FORMATDATETIME(TODAY,'yyyy-MM-dd').
0
0
0
@MadDataSc
Dmitri Apassov
5 years
RT @Pentaho: Learn basic functions and capabilities of PDI while gaining insight into best practices for successful use in real world cases….
0
4
0
@MadDataSc
Dmitri Apassov
5 years
@codek1 @Pentaho (1) put the mssql-jdbc jar into data-integration\lib\ (2) put mssql-jdbc_auth-8.2.1.x64.dll into data-integration\libswt\win64\ (3) make the conn as Native with pwd and user (4) check “use integrated security” (5) for “options” add properties: authenticationScheme - NTLM; domainName - domain.
0
0
0
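For readers who want to see how the options in step (5) of the post above fit together, here is a minimal Python sketch that assembles the equivalent SQL Server JDBC URL. The host, port, database, and domain values are hypothetical placeholders, and `build_sqlserver_url` is an illustrative helper, not part of PDI or the driver. In PDI itself these properties go into the connection's options tab rather than a hand-built URL.

```python
def build_sqlserver_url(host, port, database, domain):
    """Assemble a SQL Server JDBC URL carrying the NTLM options from the post."""
    # Property names follow the post; the values are placeholders.
    props = {
        "databaseName": database,
        "integratedSecurity": "true",
        "authenticationScheme": "NTLM",
        "domainName": domain,
    }
    tail = ";".join(f"{name}={value}" for name, value in props.items())
    return f"jdbc:sqlserver://{host}:{port};{tail}"


# Hypothetical values, for illustration only.
url = build_sqlserver_url("dbhost", 1433, "warehouse", "CORP")
print(url)
```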
@MadDataSc
Dmitri Apassov
6 years
Is it crucial to run #carte or can we just spin up a #pdi container and #cron #kitchen runs from inside it? Log files to a #db somewhere.
0
0
0
@MadDataSc
Dmitri Apassov
6 years
Keep the local (dev) file repo in .kettle/repositories.xml; in the #dockerfile, define one that you pull from bitbucket/github.
0
0
0
@MadDataSc
Dmitri Apassov
6 years
For dev (locally), point JNDI to the dev Redshift. In the #dockerfile, define JNDI for the production db.
0
0
0
@MadDataSc
Dmitri Apassov
6 years
Right, so AWS credentials would go in as env variables and the Redshift connection as JNDI.
1
0
0
@MadDataSc
Dmitri Apassov
6 years
#Carte #dockerfile needs to include #aws #sdk installation and downloading and moving #redshift #jdbc driver to /lib.
0
1
2
@MadDataSc
Dmitri Apassov
6 years
1) create a bitbucket repo for ktr, properties and jndi. 2) locally always pull before exec. 3) in #dockerfile add "make ktr repo", "pull from repo" and "replace stock config". 4) all relative paths.
1
0
0
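The "always pull before exec" step in the post above could be scripted roughly as follows. This is a minimal Python sketch that only builds the two commands; the repository directory and the transformation path are hypothetical, `build_run_commands` is an illustrative helper, and it assumes `git` and PDI's `pan.sh` are on the PATH.

```python
import posixpath


def build_run_commands(repo_dir, ktr_relpath, pan="pan.sh"):
    """Return the commands to refresh the ktr repo and then run one transformation."""
    # Step 2 of the post: pull the latest ktr/properties/jndi files first.
    pull = ["git", "-C", repo_dir, "pull"]
    # Step 4 of the post: the transformation is addressed relative to the repo.
    run = [pan, "-file=" + posixpath.join(repo_dir, ktr_relpath)]
    return [pull, run]


# Hypothetical paths, for illustration only.
commands = build_run_commands("etl-repo", "transformations/load_sales.ktr")
for cmd in commands:
    print(" ".join(cmd))
```

To actually execute them, each list can be passed to `subprocess.run(cmd, check=True)` so that a failed pull stops the run.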
@MadDataSc
Dmitri Apassov
6 years
Looking for #docker for #dummies #resources.
0
0
0
@MadDataSc
Dmitri Apassov
6 years
New workplace, new challenges. Learning #DOCKER, running #pentaho #pdi from it.
0
0
2
@MadDataSc
Dmitri Apassov
6 years
When the number of attributes (staged as #JSON) is dynamic, model only business keys and their relationships (#HUBS and #LINKS) in #datavault. Model the whole JSON into a "JSON" satellite. Extraction of concrete attributes is defined by #reporting needs.
0
0
1
@MadDataSc
Dmitri Apassov
6 years
Make it into a dictionary with double-quoted keys and values, wrap it in {} to make it a #json column, then use json_extract_path_text(string, key) to extract the needed value.
1
0
0
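The wrap-and-extract idea in the post above can be tried outside the database. A minimal Python sketch, assuming the staged value is a comma-separated run of double-quoted key/value pairs; `to_json_object` and `extract_path_text` are illustrative helpers, and the latter is only a rough stand-in for Redshift's `json_extract_path_text` (its handling of missing keys here is a simplification).

```python
import json


def to_json_object(pairs_text):
    """Wrap a '"k":"v","k2":"v2"' fragment in braces so it parses as a JSON object."""
    return "{" + pairs_text + "}"


def extract_path_text(json_text, key):
    """Return the value stored under a top-level key, or None if the key is absent."""
    return json.loads(json_text).get(key)


# Hypothetical staged fragment, for illustration only.
fragment = '"color":"red","size":"XL"'
doc = to_json_object(fragment)
size = extract_path_text(doc, "size")
print(doc, size)
```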
@MadDataSc
Dmitri Apassov
6 years
A runtime dynamic #pivot of a dataset [key, state, timestamp] can be #mimicked in #redshift #aws by using #LISTAGG(state) WITHIN GROUP (ORDER BY timestamp), grouped by key.
2
0
0
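To see what the LISTAGG approach in the post above produces, the aggregation can be emulated in a few lines. A minimal Python sketch, assuming rows of (key, state, timestamp); `listagg_states` and the sample rows are illustrative and not from the original post. It mirrors a query of the shape `SELECT key, LISTAGG(state, ',') WITHIN GROUP (ORDER BY timestamp) FROM t GROUP BY key`, where `t` is a hypothetical table name.

```python
from collections import defaultdict


def listagg_states(rows, sep=","):
    """Emulate LISTAGG(state, sep) WITHIN GROUP (ORDER BY timestamp) with GROUP BY key."""
    grouped = defaultdict(list)
    for key, state, timestamp in rows:
        grouped[key].append((timestamp, state))
    # Order each key's states by timestamp, then join them into one string per key.
    return {
        key: sep.join(state for _, state in sorted(pairs))
        for key, pairs in grouped.items()
    }


# Hypothetical sample data: (key, state, timestamp).
rows = [
    ("A", "shipped", "2020-01-03"),
    ("A", "created", "2020-01-01"),
    ("B", "created", "2020-01-02"),
    ("A", "paid", "2020-01-02"),
]
pivoted = listagg_states(rows)
print(pivoted)
```

Each key ends up with one delimited string of its states in time order, which is what makes this usable as a stand-in for a pivot whose columns are not known in advance.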