
Informatica PowerExchange for Amazon

Redshift (Version 10.0)

User Guide for PowerCenter


Informatica PowerExchange for Amazon Redshift User Guide for PowerCenter

Version 10.0
November 2015

Copyright (c) 1993-2015 Informatica LLC. All rights reserved.

This software and documentation contain proprietary information of Informatica LLC and are provided under a license agreement containing restrictions on use and
disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any
form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC. This Software may be protected by U.S. and/or
international Patents and other Patents Pending.

Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as
provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14
(ALT III), as applicable.

The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us
in writing.
Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange,
PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange Informatica
On Demand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging and
Informatica Master Data Management are trademarks or registered trademarks of Informatica LLC in the United States and in jurisdictions throughout the world. All
other company and product names may be trade names or trademarks of their respective owners.

Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights
reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All rights
reserved. Copyright © Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright Isomorphic Software. All rights reserved. Copyright © Meta
Integration Technology, Inc. All rights reserved. Copyright © Intalio. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © Adobe Systems
Incorporated. All rights reserved. Copyright © DataArt, Inc. All rights reserved. Copyright © ComponentSource. All rights reserved. Copyright © Microsoft Corporation. All
rights reserved. Copyright © Rogue Wave Software, Inc. All rights reserved. Copyright © Teradata Corporation. All rights reserved. Copyright © Yahoo! Inc. All rights
reserved. Copyright © Glyph & Cog, LLC. All rights reserved. Copyright © Thinkmap, Inc. All rights reserved. Copyright © Clearpace Software Limited. All rights
reserved. Copyright © Information Builders, Inc. All rights reserved. Copyright © OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved.
Copyright Cleo Communications, Inc. All rights reserved. Copyright © International Organization for Standardization 1986. All rights reserved. Copyright © ej-
technologies GmbH. All rights reserved. Copyright © Jaspersoft Corporation. All rights reserved. Copyright © International Business Machines Corporation. All rights
reserved. Copyright © yWorks GmbH. All rights reserved. Copyright © Lucent Technologies. All rights reserved. Copyright (c) University of Toronto. All rights reserved.
Copyright © Daniel Veillard. All rights reserved. Copyright © Unicode, Inc. Copyright IBM Corp. All rights reserved. Copyright © MicroQuill Software Publishing, Inc. All
rights reserved. Copyright © PassMark Software Pty Ltd. All rights reserved. Copyright © LogiXML, Inc. All rights reserved. Copyright © 2003-2010 Lorenzi Davide, All
rights reserved. Copyright © Red Hat, Inc. All rights reserved. Copyright © The Board of Trustees of the Leland Stanford Junior University. All rights reserved. Copyright
© EMC Corporation. All rights reserved. Copyright © Flexera Software. All rights reserved. Copyright © Jinfonet Software. All rights reserved. Copyright © Apple Inc. All
rights reserved. Copyright © Telerik Inc. All rights reserved. Copyright © BEA Systems. All rights reserved. Copyright © PDFlib GmbH. All rights reserved. Copyright ©
Orientation in Objects GmbH. All rights reserved. Copyright © Tanuki Software, Ltd. All rights reserved. Copyright © Ricebridge. All rights reserved. Copyright © Sencha,
Inc. All rights reserved. Copyright © Scalable Systems, Inc. All rights reserved. Copyright © jQWidgets. All rights reserved. Copyright © Tableau Software, Inc. All rights
reserved. Copyright© MaxMind, Inc. All Rights Reserved. Copyright © TMate Software s.r.o. All rights reserved. Copyright © MapR Technologies Inc. All rights reserved.
Copyright © Amazon Corporate LLC. All rights reserved. Copyright © Highsoft. All rights reserved. Copyright © Python Software Foundation. All rights reserved.
Copyright © BeOpen.com. All rights reserved. Copyright © CNRI. All rights reserved.

This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and/or other software which is licensed under various versions
of the Apache License (the "License"). You may obtain a copy of these Licenses at http://www.apache.org/licenses/. Unless required by applicable law or agreed to in
writing, software distributed under these Licenses is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the Licenses for the specific language governing permissions and limitations under the Licenses.

This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software
copyright © 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under various versions of the GNU Lesser General Public License
Agreement, which may be found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any
kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.

The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California,
Irvine, and Vanderbilt University, Copyright (©) 1993-2006, all rights reserved.

This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and
redistribution of this software is subject to terms available at http://www.openssl.org and http://www.openssl.org/source/license.html.

This product includes Curl software which is Copyright 1996-2013, Daniel Stenberg, <daniel@haxx.se>. All Rights Reserved. Permissions and limitations regarding this
software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or
without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

The product includes software copyright 2001-2005 (©) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.dom4j.org/license.html.

The product includes software copyright © 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to
terms available at http://dojotoolkit.org/license.

This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations
regarding this software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.

This product includes software copyright © 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at
http://www.gnu.org/software/kawa/Software-License.html.

This product includes OSSP UUID software which is Copyright © 2002 Ralf S. Engelschall, Copyright © 2002 The OSSP Project, Copyright © 2002 Cable & Wireless
Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.

This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software are
subject to terms available at http://www.boost.org/LICENSE_1_0.txt.

This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at
http://www.pcre.org/license.txt.

This product includes software copyright © 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.eclipse.org/org/documents/epl-v10.php and at http://www.eclipse.org/org/documents/edl-v10.php.
This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://
www.stlport.org/doc/license.html, http://asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://
httpunit.sourceforge.net/doc/license.html, http://jung.sourceforge.net/license.txt, http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/release/
license.html, http://www.libssh2.org, http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html, http://fusesource.com/downloads/license-
agreements/fuse-message-broker-v-5-3- license-agreement; http://antlr.org/license.html; http://aopalliance.sourceforge.net/; http://www.bouncycastle.org/licence.html;
http://www.jgraph.com/jgraphdownload.html; http://www.jcraft.com/jsch/LICENSE.txt; http://jotm.objectweb.org/bsd_license.html; http://www.w3.org/Consortium/Legal/
2002/copyright-software-20021231; http://www.slf4j.org/license.html; http://nanoxml.sourceforge.net/orig/copyright.html; http://www.json.org/license.html; http://
forge.ow2.org/projects/javaservice/, http://www.postgresql.org/about/licence.html, http://www.sqlite.org/copyright.html, http://www.tcl.tk/software/tcltk/license.html, http://
www.jaxen.org/faq.html, http://www.jdom.org/docs/faq.html, http://www.slf4j.org/license.html; http://www.iodbc.org/dataspace/iodbc/wiki/iODBC/License; http://
www.keplerproject.org/md5/license.html; http://www.toedter.com/en/jcalendar/license.html; http://www.edankert.com/bounce/index.html; http://www.net-snmp.org/about/
license.html; http://www.openmdx.org/#FAQ; http://www.php.net/license/3_01.txt; http://srp.stanford.edu/license.txt; http://www.schneier.com/blowfish.html; http://
www.jmock.org/license.html; http://xsom.java.net; http://benalman.com/about/license/; https://github.com/CreateJS/EaselJS/blob/master/src/easeljs/display/Bitmap.js;
http://www.h2database.com/html/license.html#summary; http://jsoncpp.sourceforge.net/LICENSE; http://jdbc.postgresql.org/license.html; http://
protobuf.googlecode.com/svn/trunk/src/google/protobuf/descriptor.proto; https://github.com/rantav/hector/blob/master/LICENSE; http://web.mit.edu/Kerberos/krb5-
current/doc/mitK5license.html; http://jibx.sourceforge.net/jibx-license.html; https://github.com/lyokato/libgeohash/blob/master/LICENSE; https://github.com/hjiang/jsonxx/
blob/master/LICENSE; https://code.google.com/p/lz4/; https://github.com/jedisct1/libsodium/blob/master/LICENSE; http://one-jar.sourceforge.net/index.php?
page=documents&file=license; https://github.com/EsotericSoftware/kryo/blob/master/license.txt; http://www.scala-lang.org/license.html; https://github.com/tinkerpop/
blueprints/blob/master/LICENSE.txt; http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html; https://aws.amazon.com/asl/; https://github.com/
twbs/bootstrap/blob/master/LICENSE; https://sourceforge.net/p/xmlunit/code/HEAD/tree/trunk/LICENSE.txt; https://github.com/documentcloud/underscore-contrib/blob/
master/LICENSE, and https://github.com/apache/hbase/blob/master/LICENSE.txt.
This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution
License (http://www.opensource.org/licenses/cddl1.php) the Common Public License (http://www.opensource.org/licenses/cpl1.0.php), the Sun Binary Code License
Agreement Supplemental License Terms, the BSD License (http://www.opensource.org/licenses/bsd-license.php), the new BSD License (http://opensource.org/
licenses/BSD-3-Clause), the MIT License (http://www.opensource.org/licenses/mit-license.php), the Artistic License (http://www.opensource.org/licenses/artistic-
license-1.0) and the Initial Developer’s Public License Version 1.0 (http://www.firebirdsql.org/en/initial-developer-s-public-license-version-1-0/).

This product includes software copyright © 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this
software are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab.
For further information please visit http://www.extreme.indiana.edu/.

This product includes software Copyright (c) 2013 Frank Balluffi and Markus Moeller. All rights reserved. Permissions and limitations regarding this software are subject
to terms of the MIT license.

See patents at https://www.informatica.com/legal/patents.html.

DISCLAIMER: Informatica LLC provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied
warranties of noninfringement, merchantability, or use for a particular purpose. Informatica LLC does not warrant that this software or documentation is error free. The
information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is
subject to change at any time without notice.

NOTICES

This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software
Corporation ("DataDirect") which are subject to the following terms and conditions:

1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT
INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT
LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.

Part Number: PWX-ARU-10000-0001


Table of Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Informatica Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Informatica My Support Portal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Informatica Documentation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Informatica Product Availability Matrixes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Informatica Web Site. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Informatica How-To Library. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Informatica Knowledge Base. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Informatica Support YouTube Channel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Informatica Marketplace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Informatica Velocity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Informatica Global Customer Support. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

Chapter 1: Introduction to PowerExchange for Amazon Redshift. . . . . . . . . . . . . . . . 9


PowerExchange for Amazon Redshift Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
PowerCenter Integration Service and Amazon Redshift Integration. . . . . . . . . . . . . . . . . . . . . . 10
Introduction to Amazon Redshift. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

Chapter 2: PowerExchange for Amazon Redshift Installation and Configuration. . . . . . . . . . . . . 12
PowerExchange for Amazon Redshift Installation and Configuration Overview. . . . . . . . . . . . . . 12
Prerequisites. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Installing the Server Component. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Installing the Server Component on Windows. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Installing the Server Component on Red Hat Enterprise Linux. . . . . . . . . . . . . . . . . . . . . . 13
Installing the Client Component. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Registering the Plug-in. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

Chapter 3: Amazon Redshift Sources and Targets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15


Amazon Redshift Sources and Targets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Import Amazon Redshift Objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Chapter 4: Amazon Redshift Sessions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18


Amazon Redshift Sessions Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Amazon Redshift Connections. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Amazon Redshift Connection Properties. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Configuring an Amazon Redshift Connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Configuring the Source Qualifier. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Amazon Redshift Source Sessions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Client-side Encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

4 Table of Contents
Identity Columns. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Amazon Redshift Source Session Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Amazon Redshift Target Sessions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Amazon Redshift Staging Directory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Vacuum Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Amazon Redshift Target Session Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Configuring HTTP Proxy Options at Design-Time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Configuring HTTP Proxy Options. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Success and Error Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Success Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Error Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

Appendix A: Amazon Redshift Data Type Reference. . . . . . . . . . . . . . . . . . . . . . . . . . . 27


Data Type Reference Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Amazon Redshift and Transformation Data Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

Preface
The Informatica PowerExchange for Amazon Redshift User Guide for PowerCenter describes how to read
data from and write data to Amazon Redshift. The guide is written for database administrators and
developers who are responsible for moving data from a source to an Amazon Redshift target, or from an
Amazon Redshift source to another target. This guide assumes that you have knowledge of database engines,
Amazon Redshift, and PowerCenter.

Informatica Resources

Informatica My Support Portal


As an Informatica customer, the first step in reaching out to Informatica is through the Informatica My Support
Portal at https://mysupport.informatica.com. The My Support Portal is the largest online data integration
collaboration platform with over 100,000 Informatica customers and partners worldwide.

As a member, you can:

• Access all of your Informatica resources in one place.
• Review your support cases.
• Search the Knowledge Base, find product documentation, access how-to documents, and watch support videos.
• Find your local Informatica User Group Network and collaborate with your peers.

Informatica Documentation
The Informatica Documentation team makes every effort to create accurate, usable documentation. If you
have questions, comments, or ideas about this documentation, contact the Informatica Documentation team
through email at infa_documentation@informatica.com. We will use your feedback to improve our
documentation. Let us know if we can contact you regarding your comments.

The Documentation team updates documentation as needed. To get the latest documentation for your
product, navigate to Product Documentation from https://mysupport.informatica.com.

Informatica Product Availability Matrixes


Product Availability Matrixes (PAMs) indicate the versions of operating systems, databases, and other types
of data sources and targets that a product release supports. You can access the PAMs on the Informatica My
Support Portal at https://mysupport.informatica.com.

Informatica Web Site
You can access the Informatica corporate web site at https://www.informatica.com. The site contains
information about Informatica, its background, upcoming events, and sales offices. You will also find product
and partner information. The services area of the site includes important information about technical support,
training and education, and implementation services.

Informatica How-To Library


As an Informatica customer, you can access the Informatica How-To Library at
https://mysupport.informatica.com. The How-To Library is a collection of resources to help you learn more
about Informatica products and features. It includes articles and interactive demonstrations that provide
solutions to common problems, compare features and behaviors, and guide you through performing specific
real-world tasks.

Informatica Knowledge Base


As an Informatica customer, you can access the Informatica Knowledge Base at
https://mysupport.informatica.com. Use the Knowledge Base to search for documented solutions to known
technical issues about Informatica products. You can also find answers to frequently asked questions,
technical white papers, and technical tips. If you have questions, comments, or ideas about the Knowledge
Base, contact the Informatica Knowledge Base team through email at KB_Feedback@informatica.com.

Informatica Support YouTube Channel


You can access the Informatica Support YouTube channel at http://www.youtube.com/user/INFASupport. The
Informatica Support YouTube channel includes videos about solutions that guide you through performing
specific tasks. If you have questions, comments, or ideas about the Informatica Support YouTube channel,
contact the Support YouTube team through email at supportvideos@informatica.com or send a tweet to
@INFASupport.

Informatica Marketplace
The Informatica Marketplace is a forum where developers and partners can share solutions that augment,
extend, or enhance data integration implementations. By leveraging any of the hundreds of solutions
available on the Marketplace, you can improve your productivity and speed up time to implementation on
your projects. You can access Informatica Marketplace at http://www.informaticamarketplace.com.

Informatica Velocity
You can access Informatica Velocity at https://mysupport.informatica.com. Developed from the real-world
experience of hundreds of data management projects, Informatica Velocity represents the collective
knowledge of our consultants who have worked with organizations from around the world to plan, develop,
deploy, and maintain successful data management solutions. If you have questions, comments, or ideas
about Informatica Velocity, contact Informatica Professional Services at ips@informatica.com.

Informatica Global Customer Support


You can contact a Customer Support Center by telephone or through the Online Support.

Online Support requires a user name and password. You can request a user name and password at
http://mysupport.informatica.com.

The telephone numbers for Informatica Global Customer Support are available from the Informatica web site
at http://www.informatica.com/us/services-and-training/support-services/global-support-centers/.

CHAPTER 1

Introduction to PowerExchange
for Amazon Redshift
This chapter includes the following topics:

• PowerExchange for Amazon Redshift Overview, 9
• PowerCenter Integration Service and Amazon Redshift Integration, 10
• Introduction to Amazon Redshift, 11

PowerExchange for Amazon Redshift Overview


You can use PowerExchange for Amazon Redshift to read data from or write data to Amazon Redshift. You
can also use PowerExchange for Amazon Redshift to read data from Amazon Redshift views.

Amazon Redshift views contain information about the functioning of the Amazon Redshift system. You can
query these views in the same way that you query database tables.

You can use Amazon Redshift objects as sources and targets in mappings. When you use Amazon Redshift
objects in mappings, you must configure properties specific to Amazon Redshift.

You can configure an HTTPS proxy to connect to Amazon Redshift. You can also configure an SSL
connection to Amazon Redshift.

The PowerCenter Integration Service uses the Amazon driver to communicate with Amazon Redshift.

Example
You work for an organization that stores purchase order details, such as customer ID, item codes, and item
quantity, in an on-premises MySQL database. You need to analyze the purchase order details and move the
data from the on-premises MySQL database to an affordable cloud-based environment. Create a mapping
that reads all the purchase records from the MySQL database and writes them to an Amazon Redshift target
for data analysis.

PowerCenter Integration Service and Amazon
Redshift Integration
The PowerCenter Integration Service uses the Amazon Redshift connection to connect to Amazon Redshift.

The following image shows how PowerCenter connects to Amazon Redshift to read data:

When you run the Amazon Redshift session, the PowerCenter Integration Service reads data from Amazon
Redshift based on the workflow and the Amazon Redshift connection configuration. The PowerCenter
Integration Service connects to and reads data from Amazon Simple Storage Service (Amazon S3) through a
TCP/IP network, and then stores the data in a staging directory on the PowerCenter machine. Amazon S3 is
a storage service in which you can copy data from a source and simultaneously move the data to any target.
The PowerCenter Integration Service issues a copy command that copies the data from Amazon S3 to the
target.

The following image shows how PowerCenter connects to Amazon Redshift to write data:

When you run the Amazon Redshift session, the PowerCenter Integration Service writes data to Amazon
Redshift based on the workflow and the Amazon Redshift connection configuration. The PowerCenter
Integration Service stores the data in a staging directory on the PowerCenter machine, and then connects to
and writes the data to Amazon Simple Storage Service (Amazon S3) through a TCP/IP network. Amazon S3
is a storage service in which you can copy data from a source and simultaneously move the data to Amazon
Redshift clusters. The PowerCenter Integration Service issues a copy command that copies the data from
Amazon S3 to the Amazon Redshift target table.
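The copy step described above can be sketched as a small helper that assembles the kind of COPY statement Amazon Redshift runs to load staged files from Amazon S3. This is an illustrative sketch only, not the exact command the PowerCenter Integration Service issues; the table, bucket, and credential values are placeholders.

```python
# Sketch: build a Redshift COPY statement that loads staged S3 files into
# a target table. All identifiers and credentials below are placeholders.

def build_copy_command(table: str, s3_path: str,
                       access_key: str, secret_key: str) -> str:
    """Assemble a Redshift COPY statement for CSV files staged on S3."""
    return (
        f"COPY {table} "
        f"FROM '{s3_path}' "
        f"CREDENTIALS 'aws_access_key_id={access_key};"
        f"aws_secret_access_key={secret_key}' "
        "CSV"
    )

if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(build_copy_command(
        "purchase_orders",
        "s3://example-staging-bucket/orders/",
        "EXAMPLEKEYID",
        "examplesecretkey",
    ))
```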



Introduction to Amazon Redshift
Amazon Redshift is a cloud-based petabyte-scale data warehouse service that organizations can use to
analyze and store data.

Amazon Redshift uses columnar data storage, parallel processing, and data compression to store data and to
achieve fast query execution. Amazon Redshift uses a cluster-based architecture that consists of a leader
node and compute nodes. The leader node interacts with client applications and manages and communicates
with the compute nodes. A compute node stores data and runs queries for the leader node. Any client that
uses a PostgreSQL driver can communicate with Amazon Redshift.
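Because any client that uses a PostgreSQL driver can communicate with Amazon Redshift, a connection can be sketched with an ordinary PostgreSQL client library. The psycopg2 driver, host name, port, and credentials below are illustrative assumptions, not values taken from this guide.

```python
# Sketch: connect to an Amazon Redshift cluster with a PostgreSQL driver
# and query a system catalog as if it were an ordinary table. The driver
# choice (psycopg2) and all connection values are placeholders.

def make_dsn(host: str, port: int, dbname: str,
             user: str, password: str) -> str:
    """Build a libpq-style connection string for a Redshift cluster."""
    return (f"host={host} port={port} dbname={dbname} "
            f"user={user} password={password}")

def list_tables(dsn: str):
    import psycopg2  # any PostgreSQL driver works with Redshift
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            # System views and catalogs can be queried like tables.
            cur.execute("SELECT schemaname, tablename FROM pg_tables LIMIT 5")
            return cur.fetchall()

if __name__ == "__main__":
    # Hypothetical cluster endpoint; Redshift listens on 5439 by default.
    print(make_dsn("examplecluster.abc123.us-east-1.redshift.amazonaws.com",
                   5439, "dev", "awsuser", "example-password"))
```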



CHAPTER 2

PowerExchange for Amazon Redshift Installation and Configuration
This chapter includes the following topics:

• PowerExchange for Amazon Redshift Installation and Configuration Overview, 12
• Prerequisites, 12
• Installing the Server Component, 13
• Installing the Client Component, 14
• Registering the Plug-in, 14

PowerExchange for Amazon Redshift Installation and Configuration Overview
You can install PowerExchange for Amazon Redshift on Windows 32-bit, Windows 64-bit, or Red Hat
Enterprise Linux 64-bit machines.

When you install the PowerExchange for Amazon Redshift server component, you enable the PowerCenter
Integration Service to read data from or write data to Amazon Redshift.

Prerequisites
Before you can use PowerExchange for Amazon Redshift, perform the following tasks:

1. Install or upgrade to PowerCenter 9.6.1.3.1.
2. Apply EBF 16130.
3. Verify that you can connect to Amazon Redshift with an SQL client that uses the PostgreSQL driver.
   For example, you can use SQL Workbench/J to connect to Amazon Redshift.
4. Verify that you have read, write, and execute permissions on the following directory:
   <Informatica Installation Directory>/server/bin
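You can script the permission check in step 4. The following sketch uses Python's os.access and is illustrative only; substitute the actual server/bin path on your machine:

```python
import os

# Sketch: verify read, write, and execute permissions on a directory,
# as required for <Informatica Installation Directory>/server/bin.
def has_rwx(path):
    """Return True when the current user can read, write, and execute."""
    return all(os.access(path, mode) for mode in (os.R_OK, os.W_OK, os.X_OK))

# Example with the working directory; substitute the real
# Informatica server/bin path in practice.
print(has_rwx(os.getcwd()))
```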

Installing the Server Component


The PowerExchange for Amazon Redshift server component installs the PowerCenter Integration Service
and PowerCenter Repository Service components.

If you configure the PowerCenter Integration Service or PowerCenter Repository Service to run on primary
and backup nodes, install the PowerExchange for Amazon Redshift server component on each node
configured to run the PowerCenter Integration Service or PowerCenter Repository Service.

If you configure the PowerCenter Integration Service to run on a grid, install the PowerExchange for Amazon
Redshift server component on each node configured to run on the grid. If you cannot install the PowerExchange
for Amazon Redshift server component on each node on the grid, create a resource in the domain and assign
it to each node where you installed the PowerExchange for Amazon Redshift server component. When you
create a session, configure the session to use the resource.

Installing the Server Component on Windows


You can install the PowerExchange for Amazon Redshift server component on a Windows 32-bit or Windows
64-bit machine. The PowerExchange for Amazon Redshift server component installs the PowerCenter
Integration Service and PowerCenter Repository Service components.

1. Run install.bat from the installation package.
2. Click Next.
3. Select the Informatica installation directory.
   By default, the server components are installed in the following location:
   <Informatica installation directory>\<version folder>
4. Click Next.
5. Click Install to begin the installation.
6. Click Done when the installation is complete.
The PowerCenter Integration Service and PowerCenter Repository Service components are installed.

Installing the Server Component on Red Hat Enterprise Linux


Install the PowerExchange for Amazon Redshift server component on a Red Hat Enterprise Linux 64-bit
machine when the PowerCenter Integration Service or PowerCenter Repository Service runs on Linux.

1. Enter sh install.sh at the prompt.
2. Enter the path to the Informatica installation directory.
   By default, the server components are installed in the following location:
   <User Home Directory>/Informatica/<version folder>
The PowerCenter Integration Service and PowerCenter Repository Service components are installed.



Installing the Client Component
Install the Client component on every PowerCenter Client machine that connects to the domain where the
PowerExchange for Amazon Redshift server is installed.

1. Unzip the installation archive and navigate to the root directory of the extracted installer files.
2. Run the install.bat script file.
The Welcome page appears.
3. Click Next.
The Installation Directory page appears.
4. Enter the absolute path to the Informatica installation directory. Click Browse to find the directory or use
the default directory.
By default, the PowerCenter client is installed in the following location:
C:\Informatica\<version folder>
5. Click Next.
The Pre-Installation Summary page appears.
6. Verify that all installation requirements are met and click Install.
The installer shows the progress of the installation. When the installation is complete, the Post-
Installation Summary page displays the status of the installation.
7. Click Done to close the installer.
For more information about the tasks performed by the installer, view the installation log files.

Registering the Plug-in


After you install or upgrade PowerExchange for Amazon Redshift, you must register the plug-in with the
PowerCenter repository.

A plug-in is an XML file that defines the functionality of PowerExchange for Amazon Redshift. To register the
plug-in, the repository must be running in exclusive mode. Use the Administrator tool or the pmrep
RegisterPlugin command to register the plug-in.

The plug-in file for PowerExchange for Amazon Redshift is AmazonRSCloudAdapterPlugin.xml. When you
install PowerExchange for Amazon Redshift, the installer copies the AmazonRSCloudAdapterPlugin.xml file
to the following directory:

<Informatica Installation Directory>\server\bin\Plugin

Note: If you do not have the correct privileges to register the plug-in, contact the user who manages the
PowerCenter Repository Service.



CHAPTER 3

Amazon Redshift Sources and Targets
This chapter includes the following topics:

• Amazon Redshift Sources and Targets
• Import Amazon Redshift Objects

Amazon Redshift Sources and Targets


Create a mapping with an Amazon Redshift source to read data from Amazon Redshift. Create a mapping
with any source and an Amazon Redshift target to write data to Amazon Redshift.

Import Amazon Redshift Objects


You can import Amazon Redshift source and target objects before you create a mapping.

1. Start PowerCenter Designer, and connect to a PowerCenter repository configured with an Amazon
Redshift instance.
2. Open a source or target folder.
3. Select Source Analyzer or Target Designer.
4. Click Sources or Targets, and then click Import from AmazonRSCloud Adapter.

The Establish Connection dialog box appears.
5. Specify the following information and click Connect.

   Username
      User name of the Amazon Redshift account.
   Password
      Password for the Amazon Redshift account.
   Schema
      Amazon Redshift schema name. Default is public.
   AWS Access Key ID
      Amazon S3 bucket access key ID.
   AWS Secret Access Key
      Amazon S3 bucket secret access key.
   Master Symmetric Key
      Optional. Amazon S3 encryption key. Provide a 256-bit AES encryption key in the Base64 format.
   Cluster Node Type
      Node type of the Amazon Redshift cluster. You can select the following options:
      - dw1.xlarge
      - dw1.8xlarge
      - dw2.xlarge
      - dw2.8xlarge
      For more information about nodes in the cluster, see the Amazon Redshift documentation.



   Number of Nodes in the Cluster
      Number of nodes in the Amazon Redshift cluster. For more information about nodes in the cluster,
      see the Amazon Redshift documentation.
   JDBC URL
      Amazon Redshift connection URL. If you configure the Amazon Redshift cluster for SSL, you can
      specify the secure URL.

6. Click Next.
7. Select the table that you want to import, and then click Finish. If you want to see the table metadata,
select the table, and click the table name.



CHAPTER 4

Amazon Redshift Sessions


This chapter includes the following topics:

• Amazon Redshift Sessions Overview
• Amazon Redshift Connections
• Amazon Redshift Source Sessions
• Amazon Redshift Target Sessions
• Configuring HTTP Proxy Options at Design-Time
• Success and Error Files

Amazon Redshift Sessions Overview


You must configure an Amazon Redshift connection in the Workflow Manager to read data from or write data
to an Amazon Redshift table.

The PowerCenter Integration Service writes the data to a staging directory and then to an Amazon S3 bucket
before it writes the data to Amazon Redshift. You must specify the location of the staging directory in the
session properties. You must also specify an Amazon S3 bucket name in the session properties. You must
have write access to the Amazon S3 bucket.

Amazon Redshift Connections


Use an Amazon Redshift connection to connect to the Amazon Redshift database. The PowerCenter
Integration Service uses the connection when you run an Amazon Redshift session.

Amazon Redshift Connection Properties
When you configure an Amazon Redshift connection, you define the connection attributes that the
PowerCenter Integration Service uses to connect to the Amazon Redshift database.

The following table describes the application connection properties:

Name
   Name of the Amazon Redshift connection.
Type
   The AmazonRSCloudAdapter connection type.
User Name
   User name to access the Amazon Redshift database.
Password
   Password for the Amazon Redshift database user name.

The following table describes the Amazon Redshift connection attributes:

Schema
   Amazon Redshift schema name. Default is public.
AWS Access Key ID
   Amazon S3 bucket access key ID.
AWS Secret Access Key
   Amazon S3 bucket secret access key.
Master Symmetric Key
   Optional. Amazon S3 encryption key. Provide a 256-bit AES encryption key in the Base64 format.
Cluster Node Type
   Node type of the Amazon Redshift cluster. For more information about clusters, see the Amazon
   Redshift documentation.
Number of Nodes in the Cluster
   Number of nodes in the Amazon Redshift cluster. For more information about nodes in the cluster,
   see the Amazon Redshift documentation.
JDBC URL
   Amazon Redshift connection URL.
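The JDBC URL follows the standard Amazon Redshift pattern jdbc:redshift://<endpoint>:<port>/<database>. The following sketch assembles such a URL; the cluster endpoint and database name are placeholder values:

```python
# Sketch: assemble an Amazon Redshift JDBC URL for the connection
# attribute above. The endpoint and database name are placeholders.

def redshift_jdbc_url(endpoint, database, port=5439, ssl=False):
    """Build a JDBC URL of the form jdbc:redshift://host:port/database.

    When the cluster is configured for SSL, append ssl=true so the
    driver negotiates a secure connection.
    """
    url = f"jdbc:redshift://{endpoint}:{port}/{database}"
    if ssl:
        url += "?ssl=true"
    return url

print(redshift_jdbc_url(
    "examplecluster.abc123.us-west-2.redshift.amazonaws.com", "dev"))
print(redshift_jdbc_url(
    "examplecluster.abc123.us-west-2.redshift.amazonaws.com", "dev", ssl=True))
```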

Configuring an Amazon Redshift Connection


Configure an Amazon Redshift connection in the Workflow Manager to define the connection attributes that
the PowerCenter Integration Service uses to connect to the Amazon Redshift database.

1. In the Workflow Manager, click Connections > Application.


The Application Connection Browser dialog box appears.
2. Click New.
The Select Subtype dialog box appears.
3. Select AmazonRSCloudAdapter and click OK.
The Connection Object Definition dialog box appears.



4. Enter a name for the Amazon Redshift connection.
5. Enter the application properties for the connection.
6. Enter the Amazon Redshift connection attributes.
7. Click OK.

Configuring the Source Qualifier

When you import a source to create a mapping with an Amazon Redshift source, you must configure the
source qualifier to create the mapping.

1. In the mapping, click the Source Qualifier.
2. Select the Configure tab.
3. Specify the Amazon Redshift connection details.
4. Save the mapping.

Amazon Redshift Source Sessions


Create a mapping with an Amazon Redshift source and a target to read data from Amazon Redshift.

You can encrypt data, specify the location of the staging directory, and securely unload the results of a query
to files on Amazon S3.

Client-side Encryption
Client-side encryption is a technique to encrypt data while writing the data to Amazon S3.

When you enable client-side encryption, you must provide a Master Symmetric Key in the connection
properties. The PowerCenter Integration Service encrypts the data by using the Master Symmetric Key. The
Master Symmetric Key is a 256-bit AES encryption key in the Base64 format. PowerExchange for Amazon
Redshift uploads the data to the Amazon S3 server by using the Master Symmetric Key and then loads the
data by using the copy command with the Encrypted option and a private encryption key for additional
security.

To support encryption with maximum security, you must update the security policy .jar files
local_policy.jar and US_export_policy.jar. The .jar files are located at <Informatica Installation
Directory>\java\jre\lib\security.
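You can generate a conforming Master Symmetric Key with any cryptographically secure random source. The following sketch is illustrative only, not the adapter's own key-generation procedure:

```python
import base64
import os

# Sketch: generate a 256-bit AES master symmetric key in Base64 format,
# suitable in shape for the Master Symmetric Key connection attribute.
# This is an illustration, not the adapter's own key-generation tool.
def generate_master_symmetric_key():
    """Return 32 random bytes (256 bits) encoded as Base64 text."""
    return base64.b64encode(os.urandom(32)).decode("ascii")

key = generate_master_symmetric_key()
print(len(base64.b64decode(key)))  # 32 bytes = 256 bits
```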

Identity Columns
An identity column contains unique values that are automatically generated.

Rules and Guidelines for Identity Columns


• The data type for an identity column must be either int or bigint.
• When you create a mapping for an insert operation, you must link either all the source and target identity
columns or none.
• When you create a mapping for an update, upsert, or delete operation, you cannot map identity
columns that are not part of the primary key.



• If an identity column is part of the primary key, you must map the column for update, upsert, and delete
operations, or the task fails. However, you cannot set a source value for these columns.
• The ExplicitID and MaxError count options are removed for the upsert, update, and delete operations.

Amazon Redshift Source Session Configuration


You can configure a session to read data from Amazon Redshift. Define the properties for each source in the
session.

The following table describes the session properties:

S3 Bucket Name
   Amazon S3 bucket name for the Amazon Redshift source data. Use an S3 bucket in the same region
   as your Amazon Redshift cluster.
Enable Compression
   Compresses staged files before writing the files to Amazon Redshift. Task performance improves
   when the PowerCenter Integration Service compresses the staged files. Default is selected.
Staging Directory Location
   Amazon Redshift staging directory. Specify a directory on the machine that hosts the PowerCenter
   Integration Service.
UnloadOptions Property File
   Path to the property file. Enables you to add options to the unload command to write data from an
   Amazon Redshift object to an S3 bucket. You can add the following options:
   - DELIMITER
   - ESCAPE
   - PARALLEL
   For example: DELIMITER=|, ESCAPE and PARALLEL=OFF
   Specify a directory on the machine that hosts the PowerCenter Integration Service.
Turn on S3 Client Side Encryption
   Indicates that the PowerCenter Integration Service encrypts data before writing the data to
   Amazon S3 by using a private encryption key.
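The UnloadOptions entries correspond to options of the Amazon Redshift UNLOAD command. The following sketch shows how such entries might translate into UNLOAD option text; the adapter's actual command construction is internal and may differ:

```python
# Sketch: translate UnloadOptions entries (DELIMITER, ESCAPE, PARALLEL)
# into the option clause of a Redshift UNLOAD statement. The adapter
# builds its command internally; this only illustrates the mapping.

def unload_options_clause(options):
    """Render a dict like {'DELIMITER': '|', 'ESCAPE': None,
    'PARALLEL': 'OFF'} as UNLOAD option text."""
    parts = []
    for name, value in options.items():
        if name == "DELIMITER":
            parts.append(f"DELIMITER '{value}'")
        elif name == "ESCAPE":
            parts.append("ESCAPE")   # ESCAPE takes no value
        elif name == "PARALLEL":
            parts.append(f"PARALLEL {value}")
    return " ".join(parts)

clause = unload_options_clause({"DELIMITER": "|", "ESCAPE": None,
                                "PARALLEL": "OFF"})
print(clause)  # DELIMITER '|' ESCAPE PARALLEL OFF
```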

Amazon Redshift Target Sessions


Create a session and associate it with the mapping that you created to move data to an Amazon Redshift
table. Change the connection to an Amazon Redshift connection, and define the session properties to write
data to Amazon Redshift.

You can perform insert, update, delete, and upsert operations on an Amazon Redshift target.



Amazon Redshift Staging Directory
The PowerCenter Integration Service creates a staging file in the directory that you specify in the session
properties. The PowerCenter Integration Service writes the data to the staging directory before it writes the
data to Amazon Redshift.

The PowerCenter Integration Service deletes the staged files from the staging directory after it writes the
data to Amazon S3. Specify a staging directory in the session properties with an appropriate amount of disk
space for the volume of data that you want to process. Specify a directory on the machine that hosts the
PowerCenter Integration Service.

The PowerCenter Integration Service creates subdirectories in the staging directory. Subdirectories use the
following naming convention:
<staging directory>/infaRedShiftStaging<MMDDHHmmSS>
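The subdirectory naming convention can be reproduced as follows; the staging path in the sketch is a placeholder:

```python
from datetime import datetime

# Sketch: reproduce the staging subdirectory naming convention
# <staging directory>/infaRedShiftStaging<MMDDHHmmSS>.
def staging_subdirectory(staging_dir, now=None):
    """Return the subdirectory path for a given timestamp."""
    now = now or datetime.now()
    return f"{staging_dir}/infaRedShiftStaging{now.strftime('%m%d%H%M%S')}"

print(staging_subdirectory("/data/staging", datetime(2015, 11, 3, 14, 30, 5)))
# /data/staging/infaRedShiftStaging1103143005
```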

Vacuum Tables
You can vacuum tables to recover disk space and sort rows in a specified table or in all tables in the
database.

After you run bulk operations, such as delete or load, or after you run incremental updates, you must clean
the database tables to recover disk space and to improve query performance on Amazon Redshift. Amazon
Redshift does not reclaim and reuse free space when you delete and update rows.

You can configure vacuum table recovery options in the session properties. You can choose to recover disk
space for the entire database or for individual tables in a database. Vacuum databases or tables often to
maintain consistent query performance. You must run vacuum when you expect minimal activity on the
database or during designated database administration schedules. Long durations of vacuum might impact
database operations. Run vacuum often because large unsorted regions result in longer vacuum times.

You can enable the vacuum tables option when you configure a session that writes to Amazon Redshift. You
can select the following recovery options:
None
Does not sort rows or recover disk space.

Full
Sorts the specified table or all tables in the database and recovers disk space occupied by rows marked
for deletion by previous update and delete operations.

Sort Only
Sorts the specified table or all tables in the database without recovering space freed by deleted rows.

Delete Only
Recovers disk space occupied by rows marked for deletion by previous update and delete operations,
and compresses the table to free up used space.
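The recovery options correspond to variants of the Amazon Redshift VACUUM statement. The following sketch shows that mapping; the table name is a placeholder, and the adapter issues the statement internally:

```python
# Sketch: map the session recovery options to Amazon Redshift VACUUM
# variants. The table name is a placeholder; the adapter runs the
# statement itself, this only illustrates the correspondence.

def vacuum_statement(option, table):
    """Return the VACUUM statement for a recovery option, or None."""
    keywords = {
        "Full": "FULL",
        "Sort Only": "SORT ONLY",
        "Delete Only": "DELETE ONLY",
    }
    if option == "None":
        return None  # no vacuum is issued
    return f"VACUUM {keywords[option]} {table};"

print(vacuum_statement("Full", "public.orders"))       # VACUUM FULL public.orders;
print(vacuum_statement("Sort Only", "public.orders"))  # VACUUM SORT ONLY public.orders;
print(vacuum_statement("None", "public.orders"))       # None
```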

Amazon Redshift Target Session Configuration


You can configure a session to write data to Amazon Redshift. Define the properties for each target instance
in the session.



The following table describes the session properties:

S3 Bucket Name
   Amazon S3 bucket name for the Amazon Redshift target data. Use an S3 bucket in the same region
   as your Amazon Redshift cluster.
Enable Compression
   Compresses staged files before writing the files to Amazon Redshift. Session performance improves
   when the PowerCenter Integration Service compresses the staged files. Default is selected.
Staging Directory Location
   Amazon Redshift staging directory. Specify a directory on the machine that hosts the PowerCenter
   Integration Service.
Batch Size
   Minimum number of rows in a batch. Enter a number greater than 0. Default is 1000.
Max Redshift Errors per Upload Batch for INSERT
   Number of errors within a batch that causes a batch to fail. Enter a positive integer. If the
   number of errors is equal to or greater than the property value, the PowerCenter Integration
   Service writes the entire batch to the error file. Default is 1.
Truncate Target Table Before Data Load
   Truncates an Amazon Redshift target before writing data to the target.
Null value for CHAR and VARCHAR data types
   String value used to represent null values in CHAR and VARCHAR fields in Amazon Redshift targets,
   such as NULL or a space character. Default is an empty string.
Wait time in seconds for file consistency on S3
   Number of seconds to wait for the PowerCenter Integration Service to make the staged files
   consistent with the list of files available on Amazon S3. Default is 5.
CopyOptions Property File
   Name of the property file. Enables you to add additional options to the copy command for writing
   data from an Amazon S3 source to an Amazon Redshift target when the default delimiter comma (,) or
   double-quote (") is used in the data. You can add the following options:
   - DELIMITER
   - ACCEPTINVCHARS
   - QUOTE
   For example: DELIMITER=|, ACCEPTINVCHARS=#, and QUOTE=\'
   Specify a directory on the machine that hosts the PowerCenter Integration Service.
Turn on S3 Server Side Encryption
   Indicates that Amazon S3 encrypts data during upload and decrypts data at the time of access.
Turn on S3 Client Side Encryption
   Indicates that the PowerCenter Integration Service encrypts data by using a private encryption key.
   If you enable both server side and client side encryption, the PowerCenter Integration Service
   ignores the server side encryption.



Vacuum Target Table
   Recovers disk space and sorts rows in a specified table or all tables in the database. You can
   select the following recovery options:
   None
      Does not sort rows or recover disk space.
   Full
      Sorts the specified table or all tables in the database and recovers disk space occupied by
      rows marked for deletion by previous update and delete operations.
   Sort Only
      Sorts the specified table or all tables in the database without recovering space freed by
      deleted rows.
   Delete Only
      Recovers disk space occupied by rows marked for deletion by previous update and delete
      operations, and compresses the table to free up used space.
   Default is None.
Analyze Target Table
   Improves the efficiency of the write operations. The query planner on Amazon Redshift updates the
   statistical metadata to build and choose optimal plans to improve the efficiency of queries.
INSERT
   If enabled, the PowerCenter Integration Service inserts all rows flagged for insert. If disabled,
   the PowerCenter Integration Service rejects the rows flagged for insert. By default, the insert
   operation is enabled.
DELETE
   If enabled, the PowerCenter Integration Service deletes all rows flagged for delete. If disabled,
   the PowerCenter Integration Service rejects all rows flagged for delete.
UPDATE
   Performs update and upsert operations. To perform an update operation, you must map the primary
   key column and at least one column other than the primary key column. You can select the following
   session target attributes:
   Update as Update
      The PowerCenter Integration Service updates all rows as updates.
   Update else Insert
      The PowerCenter Integration Service updates existing rows and inserts other rows as if marked
      for insert.
Success File Directory
   Directory for the Amazon Redshift success file. Specify a directory on the machine that hosts the
   PowerCenter Integration Service.
Error File Directory
   Directory for the Amazon Redshift error file. Specify a directory on the machine that hosts the
   PowerCenter Integration Service.
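The CopyOptions entries correspond to options of the Amazon Redshift copy command. The following sketch parses NAME=VALUE entries into COPY option text; the one-entry-per-line file format is an assumption, and the adapter's actual parsing may differ:

```python
# Sketch: parse CopyOptions entries into Redshift COPY option clauses.
# Assumes one NAME=VALUE entry per line, e.g. DELIMITER=| ; the exact
# property-file format expected by the adapter may differ.

def parse_copy_options(text):
    """Turn lines such as DELIMITER=| into COPY option text."""
    clauses = []
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blanks and malformed entries
        name, value = line.split("=", 1)
        clauses.append(f"{name.strip()} '{value.strip()}'")
    return " ".join(clauses)

options = parse_copy_options("DELIMITER=|\nACCEPTINVCHARS=#")
print(options)  # DELIMITER '|' ACCEPTINVCHARS '#'
```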



Configuring HTTP Proxy Options at Design-Time
If your organization uses a proxy server to access the internet, you can configure the HTTP proxy server
authentication settings at design time to read data from or write data to Amazon Redshift. You can configure
the proxy server authentication settings in the jvmoptions.ini file.

Configuring HTTP Proxy Options


If your organization uses a proxy server to access the internet, you can configure the proxy server
authentication settings in the jvmoptions.ini file to read data from or write data to Amazon Redshift.

• Ensure that you enable the proxy server settings from your web browser.
• Access the jvmoptions.ini file from the following location: <Informatica Installation Location>
\clients\PowerCenterClient\client\bin
• Add the following properties to the jvmoptions.ini file:

   -Dhttp.proxyHost=
      Name of the HTTP proxy server.
   -Dhttp.proxyPort=
      Port number of the HTTP proxy server.
   -Dhttp.proxyUser=
      Authenticated user name for the HTTP proxy server. This is required if the proxy server
      requires authentication.
   -Dhttp.proxyPassword=
      Password for the authenticated user. This is required if the proxy server requires
      authentication.
      Note: The password is in plain text and not encrypted.
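For example, the entries added to the jvmoptions.ini file might look like the following; the host, port, and credentials are placeholder values:

```
-Dhttp.proxyHost=proxy.example.com
-Dhttp.proxyPort=8080
-Dhttp.proxyUser=proxyuser
-Dhttp.proxyPassword=proxypassword
```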

Success and Error Files


The PowerCenter Integration Service generates success and error files after you run a session. Success and
error files are .csv files that contain row-level details.

The PowerCenter Integration Service does not overwrite success or error files. You can manually delete the
files that you no longer need.

Success Files
The PowerCenter Integration Service generates a success file after you run a session. The success file
contains an entry for each record that successfully writes into Amazon Redshift. Each entry contains the
values that are written for all the fields of the record. Use this file to understand the data that the
PowerCenter Integration Service writes to the Amazon S3 bucket and then to the Amazon Redshift target.

If you want the PowerCenter Integration Service to generate a success file, specify a directory for the
success file in the session properties. Specify a directory on the machine that hosts the PowerCenter
Integration Service.



Rules and Guidelines for Success Files
Consider the following guidelines when you configure the session properties for success files:

• If you do not specify a directory for the success file as a session property, the PowerCenter Integration
Service writes an empty file to the following directory: $PMTargetFileDir.
• The PowerCenter Integration Service generates the success file with the following naming convention:
<sessionName>_<timestamp>_success.csv.
• If you do not map a primary key column in a mapping, the insert, update, upsert, and delete operations
fail.

Error Files
The error file contains an entry for each data error. Each entry in the file contains the values for all fields of
the record and the error message. Use the error file to understand why the PowerCenter Integration Service
does not write data to the Amazon Redshift target.

If you want the PowerCenter Integration Service to generate an error file, specify a directory for the error file
in the session properties. You must not specify a network path or shared path.

Consider the following guidelines when you configure the session properties for error files:

• If you do not specify a directory for the error file as a session property, the PowerCenter Integration
Service writes a blank file to the following directory: $PMBadFileDir.
• The PowerCenter Integration Service generates the error file with the following naming convention:
<sessionName>_<timestamp>_error.csv.
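The naming conventions for both files can be reproduced as follows; the timestamp layout in the sketch is an assumption, because the guide does not specify its exact format:

```python
from datetime import datetime

# Sketch: reproduce the conventions <sessionName>_<timestamp>_success.csv
# and <sessionName>_<timestamp>_error.csv. The YYYYMMDDHHMMSS timestamp
# layout is an assumption; the guide does not fix the format.
def session_file_name(session, kind, now):
    """Return the success or error file name for a session run."""
    return f"{session}_{now.strftime('%Y%m%d%H%M%S')}_{kind}.csv"

ts = datetime(2015, 11, 3, 9, 15, 0)
print(session_file_name("s_redshift_load", "success", ts))
print(session_file_name("s_redshift_load", "error", ts))
```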

Sample Error File

If a target table has the fields f_integer, f_char, and f_varchar, and if a row is rejected, the PowerCenter
Integration Service generates an error file in the following format:

Entry 1
   Error Details: "Query Start Time: 2014-03-24 11:41:30.629 Offending File:
   INSERT_bdt_with_composite_key.batch_0.csv.0.gz Line Number: 4 Column Name: f_char Column Type: char
   Offending Value: .......Furniture Values Intl LLC_upd_upd ERROR Reason: Multibyte character not
   supported for CHAR (Hint: try using VARCHAR). Invalid char: c3 a6"
   f_integer: "3"
   f_char: "我Furniture Values Intl LLC_upd_upd"
   f_varchar: "001E000000SI3jIIAT"

Entry 2
   Error Details: "Query Start Time: 2014-03-24 11:42:00.763 Offending File:
   INSERT_bdt_with_composite_key.batch_8.csv.0.gz Line Number: 80 Column Name: f_char Column Type: char
   Offending Value: .......Heitkamp Inc_upd_upd ERROR Reason: Multibyte character not supported for
   CHAR (Hint: try using VARCHAR). Invalid char: c3 a6"
   f_integer: "9999"
   f_char: "我Heitkamp Inc_upd_upd"
   f_varchar: "001E000000SHd7ZIAT"



APPENDIX A

Amazon Redshift Data Type Reference
This appendix includes the following topics:

• Data Type Reference Overview
• Amazon Redshift and Transformation Data Types

Data Type Reference Overview


When you run the session to read data from or write data to Amazon Redshift, the PowerCenter Integration
Service converts the transformation data types to comparable native Amazon Redshift data types.

Amazon Redshift and Transformation Data Types

The Amazon Redshift data type names and aliases represent how Amazon Redshift stores the data types.

For example, SMALLINT is the Amazon Redshift data type name. The data type is stored as a two-byte integer.
Here, SMALLINT is the Amazon Redshift data type name and INT2 is the Amazon Redshift data type alias.

The following table compares the Amazon Redshift data types and the transformation data types:

SMALLINT (alias INT2)
   Signed two-byte integer. Transformation data type: Small Integer.
INTEGER (aliases INT, INT4)
   Signed four-byte integer. Transformation data type: Integer.
BIGINT (alias INT8)
   Signed eight-byte integer. Transformation data type: Bigint.
DECIMAL (alias NUMERIC)
   Exact numeric of selectable precision. Transformation data type: Decimal.
REAL (alias FLOAT4)
   Single precision floating-point number. Transformation data type: Double.
DOUBLE PRECISION (aliases FLOAT8, FLOAT)
   Double precision floating-point number. Transformation data type: Double.
BOOLEAN (alias BOOL)
   Logical Boolean (true/false). Transformation data type: Small Integer.
CHAR (aliases CHARACTER, NCHAR, BPCHAR)
   Fixed-length character string. Transformation data type: String.
VARCHAR (aliases CHARACTER VARYING, NVARCHAR, TEXT)
   Variable-length character string with a user-defined limit. Transformation data type: String.
DATE (no alias)
   Calendar date (year, month, day). Transformation data type: Timestamp.
TIMESTAMP (alias TIMESTAMP WITHOUT TIME ZONE)
   Date and time (without time zone). Transformation data type: Timestamp.
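The mapping above can be expressed as a lookup from an Amazon Redshift data type name or alias to the transformation data type. The following sketch is a scripting convenience, not part of the adapter:

```python
# Sketch: the data type mapping as a lookup table, keyed by Amazon
# Redshift data type name or alias. Illustrative only; the adapter
# performs this conversion internally.
REDSHIFT_TO_TRANSFORMATION = {
    "SMALLINT": "Small Integer", "INT2": "Small Integer",
    "INTEGER": "Integer", "INT": "Integer", "INT4": "Integer",
    "BIGINT": "Bigint", "INT8": "Bigint",
    "DECIMAL": "Decimal", "NUMERIC": "Decimal",
    "REAL": "Double", "FLOAT4": "Double",
    "DOUBLE PRECISION": "Double", "FLOAT8": "Double", "FLOAT": "Double",
    "BOOLEAN": "Small Integer", "BOOL": "Small Integer",
    "CHAR": "String", "CHARACTER": "String", "NCHAR": "String",
    "BPCHAR": "String",
    "VARCHAR": "String", "CHARACTER VARYING": "String",
    "NVARCHAR": "String", "TEXT": "String",
    "DATE": "Timestamp",
    "TIMESTAMP": "Timestamp", "TIMESTAMP WITHOUT TIME ZONE": "Timestamp",
}

print(REDSHIFT_TO_TRANSFORMATION["INT2"])   # Small Integer
print(REDSHIFT_TO_TRANSFORMATION["FLOAT"])  # Double
```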



