Yann Neuhaus

dbi services technical blog

AWS – RDS – SQL Server Native backups

Thu, 2020-06-04 17:22
Introduction

RDS provides an automatic backup feature. This feature backs up the entire RDS instance. As a DBA, you sometimes need to back up an individual database; there are many reasons for that (legal, migration, replication, …). Unfortunately, individual database backups are not available out of the box within an RDS instance.

This post explains how to enable native database backups on RDS, as you are used to with an on-premises SQL Server instance.
To summarize, we will create an S3 bucket on AWS to store the backups, create an IAM role with the mandatory permissions on the S3 bucket, and create an RDS option group associated with the role and containing the SQLSERVER_BACKUP_RESTORE option.

Of course, you need an existing RDS instance up and running. I have one with SQL Server 2017 Enterprise Edition.

Create a S3 bucket

If you do not already have one, first create an S3 bucket that will be the repository for your database backups.
Open the S3 management console and click [Create bucket]

Enter an S3 bucket name and select the region where your RDS instance is located.
When done, click [Create]

Create an IAM role

Open the AWS IAM management console, select [Policies] in the navigation pane and click [Create policy]

Select the JSON tab and replace the existing script with the following one:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "VisualEditor0",
			"Effect": "Allow",
			"Action": [
			"s3:ListBucket",
			"s3:GetBucketLocation"
			],
			"Resource": "arn:aws:s3:::dbi-sql-backup"
		},
		{
			"Sid": "VisualEditor1",
			"Effect": "Allow",
			"Action": [
			"s3:PutObject",
			"s3:GetObject",
			"s3:AbortMultipartUpload",
			"s3:ListMultipartUploadParts"
			],
			"Resource": "arn:aws:s3:::dbi-sql-backup/*"
		}
	]
}

In the script, do not forget to replace the S3 bucket name with the one you created previously, then click [Review policy]

Set a name for your policy and finish the creation with [Create policy]

Now, select [Roles] in the navigation pane and click [Create role]

Select the [AWS service] option, then [EC2], and finally [EC2] again in the use case list, then click [Next: Permissions]

Search for the policy you created and select it. Then click [Next: Tags]. I recommend capturing some tags to be able to identify the role more easily later. Then click [Next: Review]

Enter your role name and finally press [Create role]

Once created, edit the role again by selecting it in the role list

Select the [Trust relationships] tab and edit it.

Replace the line “Service”: “ec2.amazonaws.com” with “Service”: “rds.amazonaws.com”, as we want this role to be assumed by the RDS service.
Click [Update Trust Policy]
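After the edit, the trust relationship policy document should look similar to the following sketch (your account may show additional fields such as a Sid):

```json
{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Principal": {
				"Service": "rds.amazonaws.com"
			},
			"Action": "sts:AssumeRole"
		}
	]
}
```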

Create an Option Group

The next step is to create an option group.
To do so, open the RDS management console, select [Option groups] in the navigation pane and click [Create group]

Set the name of your option group, select the engine and the major engine version of your RDS instance, and create your option group.

Now select your option group in the list and add an option.

In the option details section, select SQLSERVER_BACKUP_RESTORE and, in the IAM role section, select the role you created previously. In the scheduling options, choose the option you want; in my case I want it to be applied immediately. Then click [Add option].

Link your RDS instance with your option group

The last configuration step is to associate your RDS instance with the option group you just created.

In the RDS management console, select [Databases] in the navigation pane and select the RDS instance for which you need native backups to be activated. Make sure as well that the version of the RDS instance matches the one set in the option group you created previously. Check that your instance is available and then click [Modify]

Scroll down to the Database options section. In the option group combo box, select the option group you created previously and then click [Continue] at the bottom of the page.

Choose when you want to apply the modification. Be aware that if you select “Apply immediately”, your RDS instance will restart and you will have a service interruption.

Test the Backup

Connect to your RDS SQL Server instance using, for example, Microsoft SQL Server Management Studio.
There is a stored procedure in [msdb] named [dbo].[rds_backup_database] that you must use to start your native database backup:

USE [msdb]

EXECUTE [dbo].[rds_backup_database]
@source_db_name = 'ProductDB'
,@s3_arn_to_backup_to = 'arn:aws:s3:::dbi-sql-backup/awsvdata_ProductDB.bak'
--,@kms_master_key_arn
--,@overwrite_s3_backup_file
--,@type
--,@number_of_files
GO

 

Adapt the script with your database name and the path of your S3 bucket followed by the backup file name. Note that I did not use all the parameters of the stored procedure in this post.
The execution of the stored procedure returns a task_id associated with your command.
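As a sketch of two of the optional parameters I commented out above (the values here are illustrative): @overwrite_s3_backup_file set to 1 allows replacing an existing object in the bucket, and @number_of_files stripes the backup into several files (in that case the ARN must contain a * wildcard for the file names):

```sql
USE [msdb]

EXECUTE [dbo].[rds_backup_database]
 @source_db_name = 'ProductDB'
,@s3_arn_to_backup_to = 'arn:aws:s3:::dbi-sql-backup/awsvdata_ProductDB*.bak'
,@overwrite_s3_backup_file = 1  -- overwrite existing objects with the same name
,@number_of_files = 4           -- stripe the backup into 4 files
GO
```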

With the task_id, you can follow up the status of the process with the following stored procedure:

Use [msdb]
execute rds_task_status @task_id = 4
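The restore works the same way, through the [dbo].[rds_restore_database] stored procedure. A minimal sketch, reusing the backup file created above to restore into a new database name:

```sql
USE [msdb]

EXECUTE [dbo].[rds_restore_database]
 @restore_db_name = 'ProductDB_copy'
,@s3_arn_to_restore_from = 'arn:aws:s3:::dbi-sql-backup/awsvdata_ProductDB.bak'
GO
```

This also returns a task_id that you can follow with rds_task_status.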

Conclusion

Enabling native database backups is indeed very practical. Unfortunately, there are some limitations.

For instance, there is no differential, transaction log or filegroup backup or restore possibility, which could be very useful in many scenarios.

The post AWS – RDS – SQL Server Native backups appeared first on the dbi services blog.

Oracle 18c – select from a flat file

Thu, 2020-06-04 16:09
By Franck Pachot

This post is the first in a series of small examples of recent Oracle features. My goal is to present them to people outside of Oracle and relational database usage, maybe some NoSQL players. And this is why the title is “select from a flat file” rather than “Inline External Tables”. In my opinion, the names of Oracle Database features are invented by the architects and developers, sometimes renamed by Marketing or the CTO, and all that is very far from what users are looking for. In order to understand “Inline External Table” you need to know all the history behind it: there were tables, then external tables, and there were queries, and inlined queries, and… But imagine a junior who just wants to query a file; he will never find this feature. He has a file: it is not a table, it is not external, and it is not inline. What is external to him is this SQL language, and what we want to show him is that this language can query his file.

I’m running this in the Oracle 20c preview in the Oracle Cloud.

In this post, my goal is to load a small fact and dimension table for the next posts about some recent features that are interesting in data warehouses. It is the occasion to show that with Oracle we can easily select from a .csv file, without the need to run SQL*Loader or create an external table.
I’m running everything from SQLcl and then I use the host command to call curl:


host curl -L http://opendata.ecdc.europa.eu/covid19/casedistribution/csv/ | dos2unix | sort -r > /tmp/covid-19.csv

This gets the latest number of COVID-19 cases per day and per country.

It looks like this:


SQL> host head  /tmp/covid-19.csv
dateRep,day,month,year,cases,deaths,countriesAndTerritories,geoId,countryterritoryCode,popData2018,continentExp
31/12/2019,31,12,2019,27,0,China,CN,CHN,1392730000,Asia
31/12/2019,31,12,2019,0,0,Vietnam,VN,VNM,95540395,Asia
31/12/2019,31,12,2019,0,0,United_States_of_America,US,USA,327167434,America
31/12/2019,31,12,2019,0,0,United_Kingdom,UK,GBR,66488991,Europe
31/12/2019,31,12,2019,0,0,United_Arab_Emirates,AE,ARE,9630959,Asia
31/12/2019,31,12,2019,0,0,Thailand,TH,THA,69428524,Asia
31/12/2019,31,12,2019,0,0,Taiwan,TW,TWN,23780452,Asia
31/12/2019,31,12,2019,0,0,Switzerland,CH,CHE,8516543,Europe
31/12/2019,31,12,2019,0,0,Sweden,SE,SWE,10183175,Europe

I sorted them on date on purpose (next posts may talk about data clustering).

I need a directory object to access the file:


SQL> create or replace directory "/tmp" as '/tmp';

Directory created.

You don’t have to use quoted identifiers if you don’t like them. I find it convenient here.

I can directly select from the file; the EXTERNAL clause mentions what we had to put in an external table before 18c:


SQL> select *
   from external (
    (
     dateRep                    varchar2(10)
     ,day                       number
     ,month                     number
     ,year                      number
     ,cases                     number
     ,deaths                    number
     ,countriesAndTerritories   varchar2(60)
     ,geoId                     varchar2(30)
     ,countryterritoryCode      varchar2(3)
     ,popData2018               number
     ,continentExp              varchar2(30)
    )
    default directory "/tmp"
    access parameters (
     records delimited by newline skip 1 -- skip header
     logfile 'covid-19.log'
     badfile 'covid-19.bad'
     fields terminated by "," optionally enclosed by '"'
    )
    location ('covid-19.csv')
    reject limit 0
   )
 .

SQL> /
      DATEREP    DAY    MONTH    YEAR    CASES    DEATHS                       COUNTRIESANDTERRITORIES       GEOID    COUNTRYTERRITORYCODE    POPDATA2018    CONTINENTEXP
_____________ ______ ________ _______ ________ _________ _____________________________________________ ___________ _______________________ ______________ _______________
01/01/2020         1        1    2020        0         0 Algeria                                       DZ          DZA                           42228429 Africa
01/01/2020         1        1    2020        0         0 Armenia                                       AM          ARM                            2951776 Europe
01/01/2020         1        1    2020        0         0 Australia                                     AU          AUS                           24992369 Oceania
01/01/2020         1        1    2020        0         0 Austria                                       AT          AUT                            8847037 Europe
01/01/2020         1        1    2020        0         0 Azerbaijan                                    AZ          AZE                            9942334 Europe
01/01/2020         1        1    2020        0         0 Bahrain                                       BH          BHR                            1569439 Asia
ORA-01013: user requested cancel of current operation

SQL>

I cancelled it as it is too long to display here.

As the query is still in the buffer, I just add a CREATE TABLE in front of it:


SQL> 1
  1* select *
SQL> c/select/create table covid as select/
   create table covid as select *
  2   from external (
  3    (
  4     dateRep                    varchar2(10)
  5     ,day                       number
...

SQL> /

Table created.

SQL>

While I’m there I’ll quickly create a fact table and a dimension hierarchy:


SQL> create table continents as select rownum continent_id, continentexp continent_name from (select distinct continentexp from covid where continentexp!='Other');

Table created.

SQL> create table countries as select country_id,country_code,country_name,continent_id from (select distinct geoid country_id,countryterritorycode country_code,countriesandterritories country_name,continentexp continent_name from covid where continentexp!='Other') left join continents using(continent_name);

Table created.

SQL> create table cases as select daterep, geoid country_id,cases from covid where continentexp!='Other';

Table created.

SQL> alter table continents add primary key (continent_id);

Table altered.

SQL> alter table countries add foreign key (continent_id) references continents;

Table altered.

SQL> alter table countries add primary key (country_id);

Table altered.

SQL> alter table cases add foreign key (country_id) references countries;

Table altered.

SQL> alter table cases add primary key (country_id,daterep);

Table altered.

SQL>

This creates a CASES fact table with only one measure (covid-19 cases) and two dimensions. To keep it simple, the date dimension here is just a date column (you usually have a foreign key to a calendar dimension). The geographical dimension is a foreign key to the COUNTRIES table, which itself has a foreign key referencing the CONTINENTS table.

12c Top-N queries

In 12c we have a nice syntax for Top-N queries with the FETCH FIRST clause of the ORDER BY:


SQL> select continent_name,country_code,max(cases) from cases join countries using(country_id) join continents using(continent_id) group by continent_name,country_code order by max(cases) desc fetch first 10 rows only;

CONTINENT_NAME                 COU MAX(CASES)
------------------------------ --- ----------
America                        USA      48529
America                        BRA      33274
Europe                         RUS      17898
Asia                           CHN      15141
America                        ECU      11536
Asia                           IND       9304
Europe                         ESP       9181
America                        PER       8875
Europe                         GBR       8719
Europe                         FRA       7578

10 rows selected.

This returns the 10 countries which had the highest number of covid-19 cases in a single day.

20c WINDOW clauses

If I want to show the date with the maximum value, I can use analytic functions and in 20c I don’t have to repeat the window several times:


SQL> select continent_name,country_code,top_date,top_cases from (
  2   select continent_name,country_code,daterep,cases
  3    ,first_value(daterep)over(w) top_date
  4    ,first_value(cases)over(w) top_cases
  5    ,row_number()over(w) r
  6    from cases join countries using(country_id) join continents using(continent_id)
  7    window w as (partition by continent_id order by cases desc)
  8   )
 9   where r=1 -- this to get the rows with the highest value only
 10   order by top_cases desc fetch first 10 rows only;

CONTINENT_NAME                 COU TOP_DATE    TOP_CASES
------------------------------ --- ---------- ----------
America                        USA 26/04/2020      48529
Europe                         RUS 02/06/2020      17898
Asia                           CHN 13/02/2020      15141
Africa                         ZAF 30/05/2020       1837
Oceania                        AUS 23/03/2020        611

The same can be done before 20c but you have to write the (partition by continent_id order by cases desc) for each projection.
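For reference, the pre-20c version of the same query would look like this, with the window specification repeated for each projection:

```sql
select continent_name,country_code,top_date,top_cases from (
 select continent_name,country_code,daterep,cases
  ,first_value(daterep)over(partition by continent_id order by cases desc) top_date
  ,first_value(cases)over(partition by continent_id order by cases desc) top_cases
  ,row_number()over(partition by continent_id order by cases desc) r
  from cases join countries using(country_id) join continents using(continent_id)
 )
 where r=1
 order by top_cases desc fetch first 10 rows only;
```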

In the next post I’ll show a very nice feature: keeping the 3-table normalized data model but, because storage is cheap, materializing some pre-computed joins. If you are a fan of NoSQL because “storage is cheap” and “joins are expensive”, then you will see what we can do with SQL in this area…

The post Oracle 18c – select from a flat file appeared first on the dbi services blog.

Oracle 12c – pre-built join index

Thu, 2020-06-04 16:09
By Franck Pachot

This post is part of a series of small examples of recent features. I’m running this in the Oracle 20c preview in the Oracle Cloud. I have created a few tables in the previous post with a mini-snowflake schema: a fact table CASES with the covid-19 cases per country and day, and a dimension hierarchy for the country with the COUNTRIES and CONTINENTS tables.

This title may look strange for people used to Oracle. I am showing the REFRESH FAST ON STATEMENT Materialized View clause here, also known as “Synchronous Refresh for Materialized Views”. This name makes sense only when you already know materialized views, complete and fast refreshes, on commit and on-demand refreshes… But that’s not what people will look for. Indexes are also refreshed by the statements, synchronously. Imagine that they were called “Synchronous Refresh for B*Trees”, do you think they would have been so popular?

A materialized view, like an index, is a redundant structure where data is stored in a different physical layout in order to be optimal for alternative queries. For example, you ingest data per date (which is the case in my covid-19 table: each day, a new row with the covid-19 cases per country). But if I want to query all points for a specific country, those are scattered through the physical segment that is behind the table (or the partition). With an index on the country_code, I can identify one country easily, because the index is sorted on the country. I may need to go to the table to get the rows, and that is expensive, but I can avoid it by adding all the attributes to the index. With Oracle, as with many databases, we can build covering indexes for real index-only access, even if the documentation doesn’t mention those names.
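On my CASES table, such a covering index could be sketched like this (the index name is hypothetical, not part of the original demo): all the columns the query needs are in the index, so the table itself is never visited.

```sql
-- leading column for the filter, remaining columns to cover the query
create index cases_country_covering on cases (country_id, daterep, cases);

-- index-only access: filter on country_id, read daterep and cases from the index
select daterep, cases from cases where country_id = 'CN';
```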

But with my snowflake schema, I’ll not have the country_code in the fact table; I have to join to a dimension. This is more expensive, because the index on the country_name gets the country_id and then I have to go to an index on the fact table to get the rows for this country_id. When it comes to joins, I cannot index the result of the join (I’m skipping bitmap join indexes here because I’m talking about covering indexes). What I would like is an index with values from multiple tables.

A materialized view can achieve much more than an index. We can build the result of the join in one table. And no need for event sourcing or streaming here to keep it up to date. No need to denormalize and risk inconsistency. When NoSQL pioneers tell you that storage is cheap and redundancy is the way to scale, just keep your relational database for integrity and build materialized views on top. When they tell you that joins are expensive, just materialize them upfront. Before 12c, keeping those materialized views consistent with the source required either:

  1. materialized view logs which is similar to event sourcing except that ON COMMIT refresh is strongly consistent
  2. partition change tracking which is ok for bulk changes, when scaling big data

This is different from indexes which are maintained immediately: when you update the row, the index is synchronized because your session has the values and the rowid and can go directly to update the index entry.

refresh fast on statement

In 12c you have the benefit of both: index-like fast maintenance with rowid access, and the MView possibility of querying pre-built joins. Here is an example on the tables created in the previous post.


SQL> create materialized view flatview refresh fast on statement as
  2  select daterep,continent_name,country_name,cases from cases join countries using(country_id) join continents using(continent_id) where cases>0;

select daterep,continent_name,country_name,cases from cases join countries using(country_id) join continents using(continent_id) where cases>0
                                                                                                                                             *
ERROR at line 2:
ORA-12015: cannot create a fast refresh materialized view from a complex query

There are some limitations when we want fast refresh, and there is a utility to help us understand what we have to change or add in our select clause.

explain_mview

I need to create the table where this utility writes its messages:


@ ?/rdbms/admin/utlxmv

SQL> set sqlformat ansiconsole
SQL> set pagesize 10000

This has created mv_capabilities_table and I can now run dbms_mview.explain_mview().

Here is the call, with the select part of the materialized view:


SQL> exec dbms_mview.explain_mview('-
  2  select daterep,continent_name,country_name,cases from cases join countries using(country_id) join continents using(continent_id) where cases>0-
  3  ');

PL/SQL procedure successfully completed.

SQL> select possible "?",capability_name,related_text,msgtxt from mv_capabilities_table where capability_name like 'REFRESH_FAST%' order by seq;

   ?                  CAPABILITY_NAME    RELATED_TEXT                                                                 MSGTXT
____ ________________________________ _______________ ______________________________________________________________________
N    REFRESH_FAST
N    REFRESH_FAST_AFTER_INSERT                        inline view or subquery in FROM list not supported for this type MV
N    REFRESH_FAST_AFTER_INSERT                        inline view or subquery in FROM list not supported for this type MV
N    REFRESH_FAST_AFTER_INSERT                        view or subquery in from list
N    REFRESH_FAST_AFTER_ONETAB_DML                    see the reason why REFRESH_FAST_AFTER_INSERT is disabled
N    REFRESH_FAST_AFTER_ANY_DML                       see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
N    REFRESH_FAST_PCT                                 PCT FAST REFRESH is not possible if query contains an inline view

SQL> rollback;

Rollback complete.

“inline view or subquery in FROM list not supported for this type MV” is actually very misleading: I use ANSI joins, they are internally translated to query blocks, and this is what is not supported.

No ANSI joins

I rewrite it with the old join syntax:


SQL> exec dbms_mview.explain_mview('-
  2  select daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0-
  3  ');

PL/SQL procedure successfully completed.

SQL> select possible "?",capability_name,related_text,msgtxt from mv_capabilities_table where capability_name like 'REFRESH_FAST%' order by seq;
   ?                  CAPABILITY_NAME       RELATED_TEXT                                                                      MSGTXT
____ ________________________________ __________________ ___________________________________________________________________________
N    REFRESH_FAST
N    REFRESH_FAST_AFTER_INSERT        CONTINENTS         the SELECT list does not have the rowids of all the detail tables
N    REFRESH_FAST_AFTER_INSERT        DEMO.CASES         the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.COUNTRIES     the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.CONTINENTS    the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_ONETAB_DML                       see the reason why REFRESH_FAST_AFTER_INSERT is disabled
N    REFRESH_FAST_AFTER_ANY_DML                          see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
N    REFRESH_FAST_PCT                                    PCT is not possible on any of the detail tables in the materialized view

SQL> rollback;

Rollback complete.

Now I need to add the ROWID of the table CONTINENTS in the materialized view.

ROWID for all tables

Yes, as I mentioned, the gap between indexes and materialized views is shorter. The REFRESH FAST ON STATEMENT requires access by rowid to update the materialized view, like when a statement updates an index.


SQL> exec dbms_mview.explain_mview('-
  2  select continents.rowid continent_rowid,daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0-
  3  ');

PL/SQL procedure successfully completed.

SQL> select possible "?",capability_name,related_text,msgtxt from mv_capabilities_table where capability_name like 'REFRESH_FAST%' order by seq;
   ?                  CAPABILITY_NAME       RELATED_TEXT                                                                      MSGTXT
____ ________________________________ __________________ ___________________________________________________________________________
N    REFRESH_FAST
N    REFRESH_FAST_AFTER_INSERT        COUNTRIES          the SELECT list does not have the rowids of all the detail tables
N    REFRESH_FAST_AFTER_INSERT        DEMO.CASES         the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.COUNTRIES     the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.CONTINENTS    the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_ONETAB_DML                       see the reason why REFRESH_FAST_AFTER_INSERT is disabled
N    REFRESH_FAST_AFTER_ANY_DML                          see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
N    REFRESH_FAST_PCT                                    PCT is not possible on any of the detail tables in the materialized view
SQL> rollback;

Rollback complete.

Now, the same message remains for the ROWID of COUNTRIES.

I continue the same way and finally I have added the ROWID for all the tables involved:


SQL> exec dbms_mview.explain_mview('-
  2  select cases.rowid case_rowid,countries.rowid country_rowid,continents.rowid continent_rowid,daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0-
  3  ');

PL/SQL procedure successfully completed.

SQL> select possible "?",capability_name,related_text,msgtxt from mv_capabilities_table where capability_name like 'REFRESH_FAST%' order by seq;
   ?                  CAPABILITY_NAME       RELATED_TEXT                                                                      MSGTXT
____ ________________________________ __________________ ___________________________________________________________________________
N    REFRESH_FAST
N    REFRESH_FAST_AFTER_INSERT        DEMO.CASES         the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.COUNTRIES     the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_INSERT        DEMO.CONTINENTS    the detail table does not have a materialized view log
N    REFRESH_FAST_AFTER_ONETAB_DML                       see the reason why REFRESH_FAST_AFTER_INSERT is disabled
N    REFRESH_FAST_AFTER_ANY_DML                          see the reason why REFRESH_FAST_AFTER_ONETAB_DML is disabled
N    REFRESH_FAST_PCT                                    PCT is not possible on any of the detail tables in the materialized view

SQL> rollback;

Rollback complete.

OK, now another message: “the detail table does not have a materialized view log”. But that’s exactly the purpose of statement-level refresh: being able to fast-refresh without creating and maintaining materialized view logs, and without fully refreshing a table or a partition.

This is the limit of DBMS_MVIEW.EXPLAIN_MVIEW. Let’s try to create the materialized view now:


SQL> create materialized view flatview refresh fast on statement as
  2  select cases.rowid case_rowid,countries.rowid country_rowid,continents.rowid continent_rowid,daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0;

select cases.rowid case_rowid,countries.rowid country_rowid,continents.rowid continent_rowid,daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0
                                                                                                                                                                                                                                                                                    *
ERROR at line 2:
ORA-32428: on-statement materialized join view error: Shape of MV is not
supported(composite PK)

SQL>

That’s clear: I had created the fact table primary key on the compound foreign keys.

Surrogate key on fact table

This is not allowed by statement-level refresh, so let’s change that:


SQL> alter table cases add (case_id number);

Table altered.

SQL> update cases set case_id=rownum;

21274 rows updated.

SQL> alter table cases drop primary key;

Table altered.

SQL> alter table cases add primary key(case_id);

Table altered.

SQL> alter table cases add unique(daterep,country_id);
Table altered.

I have added a surrogate key and defined a unique key for the composite one.

Now the creation is successful:


SQL> create materialized view flatview refresh fast on statement as
  2  select cases.rowid case_rowid,countries.rowid country_rowid,continents.rowid continent_rowid,daterep,continent_name,country_name,cases from cases , countries , continents where cases.country_id=countries.country_id and countries.continent_id=continents.continent_id and cases>0;

Materialized view created.

Note that I tested later and I was able to create it with the ROWID from the fact table CASES only. But that’s not a good idea: in order to propagate any change to the underlying tables, the materialized view must have the ROWID of each of them, like an index. I consider the possibility to do it a bug.

Here are the columns stored in my materialized view:


SQL> desc flatview

              Name    Null?            Type
__________________ ________ _______________
CASE_ROWID                  ROWID
COUNTRY_ROWID               ROWID
CONTINENT_ROWID             ROWID
DATEREP                     VARCHAR2(10)
CONTINENT_NAME              VARCHAR2(30)
COUNTRY_NAME                VARCHAR2(60)
CASES                       NUMBER

Storing the ROWID is not something we should recommend, as some maintenance operations may change the physical location of rows. You will need to complete-refresh the materialized view after an online move, for example.
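A sketch of that situation (assuming a version that supports ALTER TABLE … MOVE ONLINE): after the move, the ROWIDs stored in FLATVIEW no longer point to the rows, so a complete refresh is required.

```sql
-- relocate the rows: the stored ROWIDs in FLATVIEW become stale
alter table cases move online;

-- 'C' requests a complete refresh of the materialized view
exec dbms_mview.refresh('FLATVIEW','C');
```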

No-join query

I’ll show query rewrite in another blog post. For the moment, I’ll query this materialized view directly.

Here is a query similar to the one in the previous post:


SQL> select continent_name,country_name,top_date,top_cases from (
  2   select continent_name,country_name,daterep,cases
  3    ,first_value(daterep)over(partition by continent_name order by cases desc) top_date
  4    ,first_value(cases)over(partition by continent_name order by cases desc)top_cases
  5    ,row_number()over(partition by continent_name order by cases desc) r
  6    from flatview
  7   )
  8   where r=1 order by top_cases
  9  ;

   CONTINENT_NAME                COUNTRY_NAME      TOP_DATE    TOP_CASES
_________________ ___________________________ _____________ ____________
Oceania           Australia                   23/03/2020             611
Africa            South_Africa                30/05/2020            1837
Asia              China                       13/02/2020           15141
Europe            Russia                      02/06/2020           17898
America           United_States_of_America    26/04/2020           48529

I have replaced the country_id and continent_id by their names as I didn’t put the IDs in my materialized view. And I repeated the window specification in each function so that you can run the same query in versions lower than 20c.

This materialized view is a table. I can partition it by hash to scatter the data. I can cluster on another column. I can add indexes. I have the full power of a SQL database on it, without the need to join if you think that joins are slow. If you come from NoSQL you can see it like a DynamoDB global index: you can query it without joining, fetching all attributes with one call, and filtering on another key than the primary key. But here we always have strong consistency: the changes are replicated immediately, fully ACID. They will be committed or rolled back by the same transaction that did the change, and replicated synchronously or asynchronously to read-only replicas.

DML on base tables

Let’s make some changes here, lowering the covid-19 cases of CHN to 42% of their original value:


SQL> alter session set sql_trace=true;

Session altered.

SQL> update cases set cases=cases*0.42 where country_id=(select country_id from countries where country_code='CHN');

157 rows updated.

SQL> alter session set sql_trace=false;

Session altered.

I have set sql_trace because I want to have a look at the magic behind it.

Now running my query on the materialized view:



SQL> select continent_name,country_name,top_date,top_cases from (
  2   select continent_name,country_name,daterep,cases
  3    ,first_value(daterep)over(partition by continent_name order by cases desc) top_date
  4    ,first_value(cases)over(partition by continent_name order by cases desc)top_cases
  5    ,row_number()over(partition by continent_name order by cases desc) r
  6    from flatview
  7   )
  8*  where r=1 order by top_cases;

   CONTINENT_NAME                COUNTRY_NAME      TOP_DATE    TOP_CASES
_________________ ___________________________ _____________ ____________
Oceania           Australia                   23/03/2020             611
Africa            South_Africa                30/05/2020            1837
Asia              India                       04/06/2020            9304
Europe            Russia                      02/06/2020           17898
America           United_States_of_America    26/04/2020           48529

CHN is not the top one in Asia anymore with the 42% correction.

The changes were immediately propagated to the materialized view like when indexes are updated, and we can see that in the trace:


SQL> column value new_value tracefile
SQL> select value from v$diag_info where name='Default Trace File';
                                                                     VALUE
__________________________________________________________________________
/u01/app/oracle/diag/rdbms/cdb1a_iad154/CDB1A/trace/CDB1A_ora_49139.trc


SQL> column value clear
SQL> host tkprof &tracefile trace.txt

TKPROF: Release 20.0.0.0.0 - Development on Thu Jun 4 15:43:13 2020

Copyright (c) 1982, 2020, Oracle and/or its affiliates.  All rights reserved.

sql_trace instruments all executions with time and number of rows. tkprof aggregates those for analysis.

The trace shows two statements on my materialized view: DELETE and INSERT.

The first one is about removing the modified rows.


DELETE FROM "DEMO"."FLATVIEW"
WHERE
 "CASE_ROWID" = :1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      157      0.00       0.00          0          0          0           0
Execute    157      0.01       0.04         42        314        433         141
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      314      0.01       0.04         42        314        433         141

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 634     (recursive depth: 1)
Number of plan statistics captured: 3

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  DELETE  FLATVIEW (cr=2 pr=1 pw=0 time=395 us starts=1)
         1          1          1   INDEX UNIQUE SCAN I_OS$_FLATVIEW (cr=2 pr=1 pw=0 time=341 us starts=1 cost=1 size=10 card=1)(object id 78628)

This has been done row-by-row but is optimized with an index on ROWID that has been created autonomously with my materialized view.
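That supporting index can be checked in the data dictionary. A minimal sketch (the index name I_OS$_FLATVIEW comes from the trace above; column selection is just illustrative):

```sql
-- List the index Oracle created automatically on the materialized view
-- to support fast row-by-row maintenance.
select index_name, uniqueness
from   user_indexes
where  table_name = 'FLATVIEW';
```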

The second one is inserting the modified rows:


INSERT INTO  "DEMO"."FLATVIEW" SELECT "CASES".ROWID "CASE_ROWID",
  "COUNTRIES".ROWID "COUNTRY_ROWID","CONTINENTS".ROWID "CONTINENT_ROWID",
  "CASES"."DATEREP" "DATEREP","CONTINENTS"."CONTINENT_NAME" "CONTINENT_NAME",
  "COUNTRIES"."COUNTRY_NAME" "COUNTRY_NAME","CASES"."CASES" "CASES" FROM
  "CONTINENTS" "CONTINENTS","COUNTRIES" "COUNTRIES", (SELECT "CASES".ROWID
  "ROWID","CASES"."DATEREP" "DATEREP","CASES"."CASES" "CASES",
  "CASES"."COUNTRY_ID" "COUNTRY_ID" FROM "DEMO"."CASES" "CASES" WHERE
  "CASES".ROWID=(:Z)) "CASES" WHERE "CASES"."COUNTRY_ID"=
  "COUNTRIES"."COUNTRY_ID" AND "COUNTRIES"."CONTINENT_ID"=
  "CONTINENTS"."CONTINENT_ID" AND "CASES"."CASES">0


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      157      0.00       0.01          0          0          0           0
Execute    157      0.01       0.02          0        755        606         141
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      314      0.02       0.03          0        755        606         141

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 634     (recursive depth: 1)
Number of plan statistics captured: 3

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  LOAD TABLE CONVENTIONAL  FLATVIEW (cr=8 pr=0 pw=0 time=227 us starts=1)
         1          1          1   NESTED LOOPS  (cr=5 pr=0 pw=0 time=29 us starts=1 cost=3 size=47 card=1)
         1          1          1    NESTED LOOPS  (cr=3 pr=0 pw=0 time=20 us starts=1 cost=2 size=37 card=1)
         1          1          1     TABLE ACCESS BY USER ROWID CASES (cr=1 pr=0 pw=0 time=11 us starts=1 cost=1 size=17 card=1)
         1          1          1     TABLE ACCESS BY INDEX ROWID COUNTRIES (cr=2 pr=0 pw=0 time=9 us starts=1 cost=1 size=20 card=1)
         1          1          1      INDEX UNIQUE SCAN SYS_C009401 (cr=1 pr=0 pw=0 time=4 us starts=1 cost=0 size=0 card=1)(object id 78620)
         1          1          1    TABLE ACCESS BY INDEX ROWID CONTINENTS (cr=2 pr=0 pw=0 time=5 us starts=1 cost=1 size=10 card=1)
         1          1          1     INDEX UNIQUE SCAN SYS_C009399 (cr=1 pr=0 pw=0 time=2 us starts=1 cost=0 size=0 card=1)(object id 78619)

Again, this is apparently a row-by-row insert, as the “execute count” is nearly the same as the “rows count”. 157 is the number of rows I have updated.

You may think that this is a huge overhead, but those operations have been optimized for a long time. The materialized view is refreshed and ready for optimal queries: no need to queue, stream, reorg, vacuum,… And I can imagine that if this feature gets used, it will be optimized with bulk operations, which would allow compression.

Truncate

This looks all good. But… what happens if I truncate the table?


SQL> truncate table cases;

Table truncated.

SQL> select continent_name,country_name,top_date,top_cases from (
  2   select continent_name,country_name,daterep,cases
  3    ,first_value(daterep)over(partition by continent_name order by cases desc) top_date
  4    ,first_value(cases)over(partition by continent_name order by cases desc)top_cases
  5    ,row_number()over(partition by continent_name order by cases desc) r
  6    from flatview
  7   )
  8*  where r=1 order by top_cases;
   CONTINENT_NAME                COUNTRY_NAME      TOP_DATE    TOP_CASES
_________________ ___________________________ _____________ ____________
Oceania           Australia                   23/03/2020             611
Africa            South_Africa                30/05/2020            1837
Asia              India                       04/06/2020            9304
Europe            Russia                      02/06/2020           17898
America           United_States_of_America    26/04/2020           48529

Nothing changed. This is dangerous. You need to refresh it yourself. This may be a bug. What will happen if you insert data back? Note that, like with triggers, direct-path inserts will be transparently run as conventional inserts.
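To bring the materialized view back in sync after such an operation, a complete refresh is needed. A minimal sketch, assuming the FLATVIEW materialized view from this post:

```sql
-- Complete refresh (method 'C') rebuilds FLATVIEW
-- from the current content of the base tables.
exec dbms_mview.refresh('FLATVIEW', method => 'C');
```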

Joins are not expensive

This feature is really good for pre-building the joins in a composition of tables, as in a hierarchical key-value store or a snowflake-dimension fact table. You can partition, compress, order, filter, index,… as with any relational table. There is no risk with the denormalization here, as it is transparently maintained when you update the underlying tables.

If you develop on a NoSQL database because you have heard that normalization was invented to reduce storage, which is not expensive anymore, that’s a myth (you can read this long thread to understand the origin of this myth). Normalization is about database integrity and the separation of logical and physical layers. And that’s what Oracle Database implements with this feature: you update the logical view, tables are normalized for integrity, and the physical layer transparently maintains additional structures like indexes and materialized views to keep queries under single-digit milliseconds. Today you still need to think about which indexes and materialized views to build. Some advisors may help. All those are the bricks for the future: an autonomous database where you define only the logical layer for your application, and all those optimisations are done in the background.

The article Oracle 12c – pre-built join index appeared first on the dbi services blog.

Functions in SQL with the Multitenant Containers Clause

Wed, 2020-06-03 08:27
By Clemens Bleile

To prepare a presentation about Multitenant Tuning I wanted to see the METHOD_OPT dbms_stats global preference of all my pluggable DBs. In this specific case I had 3 PDBs called pdb1, pdb2 and pdb3 in my CDB. For testing purposes I changed the global preference in pdb1 from its default ‘FOR ALL COLUMNS SIZE AUTO’ to ‘FOR ALL INDEXED COLUMNS SIZE AUTO’:

c##cbleile_adm@orclcdb@PDB1> exec dbms_stats.set_global_prefs('METHOD_OPT','FOR ALL INDEXED COLUMNS SIZE AUTO');
c##cbleile_adm@orclcdb@PDB1> select dbms_stats.get_prefs('METHOD_OPT') from dual;

DBMS_STATS.GET_PREFS('METHOD_OPT')
------------------------------------
FOR ALL INDEXED COLUMNS SIZE AUTO

Afterwards I ran my SQL with the containers clause from the root container:


c##cbleile_adm@orclcdb@CDB$ROOT> select con_id, dbms_stats.get_prefs('METHOD_OPT') method_opt from containers(dual);

    CON_ID METHOD_OPT
---------- --------------------------------
         1 FOR ALL COLUMNS SIZE AUTO
         3 FOR ALL COLUMNS SIZE AUTO
         4 FOR ALL COLUMNS SIZE AUTO
         5 FOR ALL COLUMNS SIZE AUTO

4 rows selected.

For CON_ID 3 I expected to see “FOR ALL INDEXED COLUMNS SIZE AUTO”. What is wrong here?

I actually got it to work with the following query:


c##cbleile_adm@orclcdb@CDB$ROOT> select con_id, method_opt from containers(select dbms_stats.get_prefs('METHOD_OPT') method_opt from dual);

    CON_ID METHOD_OPT
---------- ----------------------------------
         1 FOR ALL COLUMNS SIZE AUTO
         4 FOR ALL COLUMNS SIZE AUTO
         5 FOR ALL COLUMNS SIZE AUTO
         3 FOR ALL INDEXED COLUMNS SIZE AUTO

4 rows selected.

That is interesting. First of all I didn’t know that you can actually use SELECT statements in the containers clause (according to the syntax diagram it has to be a table or view name only), and secondly the function dbms_stats.get_prefs in the first example has obviously been called in the root container after getting the data.

I verified that last statement with a simple test by creating a function in all containers, which just returns the container id of the current container:


create or replace function c##cbleile_adm.show_con_id return number
as
conid number;
begin
     select to_number(sys_context('USERENV', 'CON_ID')) into conid from sys.dual;
     return conid;
  end;
/

And then the test:


c##cbleile_adm@orclcdb@CDB$ROOT> select con_id, show_con_id from containers(dual);

    CON_ID SHOW_CON_ID
---------- -----------
         1           1
         3           1
         4           1
         5           1

4 rows selected.

c##cbleile_adm@orclcdb@CDB$ROOT> select con_id, show_con_id from containers(select show_con_id from dual);

    CON_ID SHOW_CON_ID
---------- -----------
         4           4
         1           1
         3           3
         5           5

4 rows selected.

That proved that the function in the select-list of the first statement is actually called in the root container after getting the data from the PDBs.

Summary:
– be careful when running the containers-clause in a select-statement with a function in the select-list. You may get unexpected results.
– the syntax with a select-statement in the containers clause is interesting.

REMARK: Above tests have been performed with Oracle 19.6.

The article Functions in SQL with the Multitenant Containers Clause appeared first on the dbi services blog.

Introduction to Azure SQL Database Auto-failover groups

Wed, 2020-06-03 01:30

Azure SQL Database by default offers a 99.99% availability SLA across all its service tiers. This means that for any database, the downtime should not exceed 52 minutes per year. Using zone redundancy increases availability to 99.995%, which is about 26 minutes per year.
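As a quick sanity check, those downtime budgets follow from simple arithmetic over a 365-day year:

```latex
(1 - 0.9999)  \times 365 \times 24 \times 60 \approx 52.6 \ \text{minutes per year}
(1 - 0.99995) \times 365 \times 24 \times 60 \approx 26.3 \ \text{minutes per year}
```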

These impressive numbers can be achieved through in-region redundancy of the compute and storage resources and automatic failover within the region.

Some disruptive events may impact the region’s availability, like a datacenter outage, possibly caused by a natural disaster.
To protect against a region-wide disaster, Auto-failover groups can be configured to automatically fail over one or multiple databases to another region.

The technology behind Auto-failover groups is the same as geo-replication, but there are some differences.

In this blog post, we will configure an Auto-failover group for the databases we created in a previous post.

So I start with the simple configuration of multiple Azure SQL Databases on a single server.

SSMS_Azure_Database

Create a new server in another region

First, we need to create our backup server in another Azure region, I will choose Japan East.

New-AzSqlServer -ResourceGroupName 'SQLFailover-lab-rg' `
    -ServerName 'snasqlsrv-lab-02' `
    -Location 'Japan East' `
    -SqlAdministratorCredentials $(New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList "LabSqlAdmin", $(ConvertTo-SecureString -String "MyPassword" -AsPlainText -Force))

I now have my second server ready with no databases except master.

Get-AzResource | Where ResourceType -like '*Sql*' | ft

Name                    ResourceGroupName  ResourceType                    Location
----                    -----------------  ------------                    --------
snasqlsrv-lab-01        SQLFailover-lab-rg Microsoft.Sql/servers           francecentral
snasqlsrv-lab-01/DB01   SQLFailover-lab-rg Microsoft.Sql/servers/databases francecentral
snasqlsrv-lab-01/DB02   SQLFailover-lab-rg Microsoft.Sql/servers/databases francecentral
snasqlsrv-lab-01/master SQLFailover-lab-rg Microsoft.Sql/servers/databases francecentral
snasqlsrv-lab-02        SQLFailover-lab-rg Microsoft.Sql/servers           japaneast
snasqlsrv-lab-02/master SQLFailover-lab-rg Microsoft.Sql/servers/databases japaneast
Create the Auto-Failover group
New-AzSqlDatabaseFailoverGroup -ResourceGroupName 'SQLFailover-lab-rg' `
>>     -ServerName 'snasqlsrv-lab-01' -PartnerServerName 'snasqlsrv-lab-02' `
>>     -FailoverGroupName 'sqlfailover-lab-fg'

FailoverGroupName                    : sqlfailover-lab-fg
Location                             : France Central
ResourceGroupName                    : SQLFailover-lab-rg
ServerName                           : snasqlsrv-lab-01
PartnerLocation                      : Japan East
PartnerResourceGroupName             : SQLFailover-lab-rg
PartnerServerName                    : snasqlsrv-lab-02
ReplicationRole                      : Primary
ReplicationState                     : CATCH_UP
ReadWriteFailoverPolicy              : Automatic
FailoverWithDataLossGracePeriodHours : 1
DatabaseNames                        : {}

I now have an Automatic Failover group between my 2 regions.

Add databases to Auto-Failover group

There are no databases in the group yet. Let’s add them.

$database = Get-AzSqlDatabase -ResourceGroupName 'SQLFailover-lab-rg' -ServerName 'snasqlsrv-lab-01' -DatabaseName 'DB01'
Add-AzSqlDatabaseToFailoverGroup -ResourceGroupName 'SQLFailover-lab-rg' -ServerName 'snasqlsrv-lab-01' `
    -FailoverGroupName 'sqlfailover-lab-fg' -Database $database

The script above adds a single database to the Failover group. The script below will add all databases on the primary server.

$server01 = Get-AzSqlServer -ResourceGroupName 'SQLFailover-lab-rg' -ServerName 'snasqlsrv-lab-01'
$server01 | Add-AzSqlDatabaseToFailoverGroup -FailoverGroupName 'sqlfailover-lab-fg' -Database ($server01 | Get-AzSqlDatabase)

I can now see my two databases in my Failover group:

Get-AzSqlDatabaseFailoverGroup -FailoverGroupName 'sqlfailover-lab-fg' -ResourceGroupName 'SQLFailover-lab-rg' -ServerName 'snasqlsrv-lab-01'


FailoverGroupName                    : sqlfailover-lab-fg
Location                             : France Central
ResourceGroupName                    : SQLFailover-lab-rg
ServerName                           : snasqlsrv-lab-01
PartnerLocation                      : Japan East
PartnerResourceGroupName             : SQLFailover-lab-rg
PartnerServerName                    : snasqlsrv-lab-02
ReplicationRole                      : Primary
ReplicationState                     : CATCH_UP
ReadWriteFailoverPolicy              : Automatic
FailoverWithDataLossGracePeriodHours : 1
DatabaseNames                        : {DB01, DB02}

An overview of the group with a map is now available on Azure Portal.

AzurePortal_FG

Listeners

The Auto-failover group provides 2 listeners. The first one is for read-write OLTP-type workloads; the second one is for read-only connections.
The DNS records are updated automatically to redirect to the correct server after a role change.

  • Read-write: <failovergroup-name>.database.windows.net
  • Read-only: <failovergroup-name>.secondary.database.windows.net
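For example, connection strings for the failover group created in this lab would look like the following (a sketch; the user, database, and password placeholders are the ones used earlier in this lab):

```
Read-write: Server=tcp:sqlfailover-lab-fg.database.windows.net,1433;Database=DB01;User ID=LabSqlAdmin;Password=MyPassword;Encrypt=True
Read-only:  Server=tcp:sqlfailover-lab-fg.secondary.database.windows.net,1433;Database=DB01;User ID=LabSqlAdmin;Password=MyPassword;Encrypt=True;ApplicationIntent=ReadOnly
```

After a failover, the application keeps the same connection strings; only the DNS records behind them change.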

Connecting to my Read-only listener I can indeed read data but not modify it.

Msg 3906, Level 16, State 2, Line 6
Failed to update database "DB01" because the database is read-only.
Check the status of the Auto-Failover group

As we have seen already, we can look at the status in PowerShell with Get-AzSqlDatabaseFailoverGroup:

(Get-AzSqlDatabaseFailoverGroup `
   -FailoverGroupName 'sqlfailover-lab-fg' `
   -ResourceGroupName 'SQLFailover-lab-rg' `
   -ServerName 'snasqlsrv-lab-01').ReplicationRole

Some information about geo-replication is available in SQL DMVs. Here I use sys.geo_replication_links in the master database.

select d.name, @@SERVERNAME AS ServerName
	, replication_state_desc, role_desc, partner_server 
from sys.geo_replication_links AS grl
	join sys.databases AS d
		on grl.database_id = d.database_id

Manual Failover

A failover can be done manually using the Switch-AzSqlDatabaseFailoverGroup command.
Before doing it, let’s simply check from SQL which server is currently the primary when I connect to my read-write listener.

My primary server is server 01. Let’s fail over.

Switch-AzSqlDatabaseFailoverGroup -ResourceGroupName 'SQLFailover-lab-rg' `
   -ServerName 'snasqlsrv-lab-02' `
   -FailoverGroupName 'sqlfailover-lab-fg'

After a few seconds, my read-write listener now redirects my connection to server 02.

 

In this blog post, we have seen how to configure an Auto-failover group for Azure SQL Database.
This is definitely something that would have been more complicated to do on-premises.

 

The article Introduction to Azure SQL Database Auto-failover groups appeared first on the dbi services blog.

How to create an Azure SQL Database using Azure PowerShell

Tue, 2020-06-02 00:56

In this blog post, I’ll go through the steps to create an Azure SQL Database using Azure PowerShell.

Introduction to Azure SQL Database

The SQL database services provided by Microsoft on the cloud are now grouped under the name of Azure SQL.

The Azure SQL family contains services that I will briefly summarize;

  • Azure SQL Database – DBaaS (Database-as-a-Service)
  • Azure SQL Managed Instance – PaaS (Platform-as-a-Service)
  • SQL Server on Azure VMs – IaaS (Infrastructure-as-a-Service)

In this blog post, I will use Azure SQL Database.

Azure SQL Database offers the following deployment options:

  • Single database – a fully-managed, isolated database
  • Elastic pool – a collection of single databases with a shared set of resources

I will not describe this service in detail but basically, it is a fully managed SQL database, similar to a contained database in SQL Server.

All the steps below can be done on the Azure Portal. For this blog post, I’ll only use Azure PowerShell which you can install on your operating system or use online with Azure Cloud Shell.

1) Install and Import Az module

First, we need to install Azure PowerShell which provides a set of commands to manage your Azure resources from your favorite operating system; Windows, macOS, and Linux.

PS C:\> Install-Module Az
PS C:\> Get-InstalledModule -Name Az | select Name, Version

Name Version
---- -------
Az   4.1.0

PS C:\> Import-Module Az
2) Sign in to Azure

Connect to your Tenant using your Tenant ID.
You can find your Tenant ID in the Azure Portal under “Azure Active Directory”.

PS C:\> Connect-AzAccount -Tenant 'b9c70123-xxx-xxx-xxx-xxxx'

Account           SubscriptionName     TenantId                      Environment
-------           ----------------     --------                      -----------
my@Email.com      Visual Studio Ent    b9c70978-xxx-xxx-xxx-xxxx     AzureCloud

PS C:\>

Then, if you use multiple Azure subscriptions, select the one you want to work with.

PS C:\> Set-AzContext -SubscriptionId '891f5acc-xxx-xxx-xxx-xxxx'
3) Create a Resource Group

Let’s start with creating a Resource Group. A resource group is a container that holds related resources for an Azure solution.

PS C:\> New-AzResourceGroup -Name 'SQLFailover-lab-rg' -Location 'France Central'

ResourceGroupName : SQLFailover-lab-rg
Location          : francecentral
ProvisioningState : Succeeded
Tags              :
ResourceId        : /subscriptions/891f5acc-xxx-xxx-xxx-xxxx/resourceGroups/SQLFailover-lab-rg

To list all your Resource Groups use the Get-AzResourceGroup command:

Get-AzResourceGroup | select ResourceGroupName
4) Create an SQL Server

Create a logical server with a unique server name to host our SQL databases.

New-AzSqlServer -ResourceGroupName 'SQLFailover-lab-rg' `
    -ServerName 'snasqlsrv-lab-01' `
    -Location 'France Central' `
    -SqlAdministratorCredentials $(New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList "LabSqlAdmin", $(ConvertTo-SecureString -String "MyPassword" -AsPlainText -Force))

The last parameter defines the credentials I will use to connect as an administrator to my SQL Database server.
Once the server is created you get the FQDN that will be used for connections.

PS C:\> Get-AzSqlServer | select FullyQualifiedDomainName

FullyQualifiedDomainName
------------------------
snasqlsrv-lab-01.database.windows.net
5) Create a Server Firewall Rule

To access the server and all the databases from my client computer I need to create a server firewall rule.
Here I use a WebRequest to get my public IP into a variable and then create the server firewall rule.

$myIp = (Invoke-WebRequest ifconfig.me/ip).Content
New-AzSqlServerFirewallRule -ResourceGroupName 'SQLFailover-lab-rg' `
    -ServerName 'snasqlsrv-lab-01' `
    -FirewallRuleName "AllowedIPs" -StartIpAddress $myIp -EndIpAddress $myIp
6) Connect to the SQL Server from SSMS

The SQL Server is now accessible from my client computer on port 1433. I can connect to it using SSMS.

ConnectSSMS
SSMS

7) Create a database

The following command will create a database named “DB01” with an S0 performance level and using the sample schema “AdventureWorksLT”.

New-AzSqlDatabase  -ResourceGroupName 'SQLFailover-lab-rg' `
    -ServerName 'snasqlsrv-lab-01' `
    -DatabaseName 'DB01' `
    -RequestedServiceObjectiveName "S0" `
    -SampleName "AdventureWorksLT"

This is it. We just created an Azure SQL Database with a few commands.

Bonus: Creating a Copy of the database

I just want to mention a nice T-SQL command available with Azure SQL Database that doesn’t exist on-premises: “CREATE DATABASE … AS COPY OF”.
This command creates a copy of a database with a new name. It replaces the backup/“restore with move” that we sometimes do on SQL Server.
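A minimal sketch, using the database created above (the copy name DB01_copy is arbitrary):

```sql
-- Run in the master database of the logical server.
-- Creates a transactionally consistent copy of DB01 under a new name.
CREATE DATABASE DB01_copy AS COPY OF DB01;
```

The copy is created asynchronously; it becomes a fully independent database once the copy operation completes.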

Cleanup

When you’re done with your tests you can delete all resources in the resource group (firewall rules, server, databases) with a single command;

PS C:\> Remove-AzResourceGroup -ResourceGroupName 'SQLFailover-lab-rg'

 

 

The article How to create an Azure SQL Database using Azure PowerShell appeared first on the dbi services blog.

Oracle 20c : Create a Far Sync Instance Is Now Easy

Sat, 2020-05-30 14:07

A far sync instance is like a standby instance in that it can receive redo from the primary database and ship that redo to other members of the Data Guard configuration. But unlike a physical standby instance, a far sync instance does not contain any datafiles and therefore cannot be opened for access. A far sync instance just manages a controlfile. A far sync instance cannot be converted to a primary instance or any other type of standby.
Far sync instances are part of the Oracle Active Data Guard Far Sync feature, which requires an Oracle Active Data Guard license.
Until Oracle 20c, the creation of a far sync instance was manual, and the far sync instance had to be manually added to the broker.

Starting with Oracle 20c, Oracle can now create a far sync instance for us and also add it to the broker configuration.

In this blog I am showing how to use this functionality. Below is the current configuration I am using:

DGMGRL> show configuration

Configuration - prod20

  Protection Mode: MaxAvailability
  Members:
  prod20_site1 - Primary database
    prod20_site2 - Physical standby database
    prod20_site4 - Physical standby database

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 55 seconds ago)

DGMGRL>

And I am going to create a far sync instance fs_site3 to receive changes from the primary database prod20_site1 and to ship these changes to prod20_site4, as shown in this figure.

With Oracle 20c there is a new CREATE FAR_SYNC command which will create the far sync instance for us. But before using this command there are some steps.
First we have to configure a Secure External Password Store for the net aliases we use.

In our case we are using the following aliases:
prod20_site1
prod20_site2
prod20_site3
prod20_site4

oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)] tnsping prod20_site1
…
…
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = oraadserver)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = prod20_site1_dgmgrl)))
OK (0 msec)
oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)] tnsping prod20_site2
…
…
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = oraadserver2)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = prod20_site2_dgmgrl)))
oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)] tnsping prod20_site3
…
…
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = oraadserver3)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = fs_site3_dgmgrl)))
OK (0 msec)
oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)] tnsping prod20_site4
…
…
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = oraadserver4)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = prod20_site4_dgmgrl)))
OK (0 msec)
oracle@oraadserver:/home/oracle/ [prod20 (CDB$ROOT)]

Basically, to configure the Secure External Password Store:

mkstore -wrl wallet_location -create
mkstore -wrl wallet_location -createCredential prod20_site1 sys 
mkstore -wrl wallet_location -createCredential prod20_site2 sys 
mkstore -wrl wallet_location -createCredential prod20_site3  sys 
…

Afterwards you will have to update your sqlnet.ora file with the location of the wallet.
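A typical sqlnet.ora fragment looks like this (the wallet directory here is an assumption; use the wallet_location you passed to mkstore):

```
WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA = (DIRECTORY = /u01/app/oracle/wallet)))

SQLNET.WALLET_OVERRIDE = TRUE
```

SQLNET.WALLET_OVERRIDE = TRUE makes the client use the wallet credentials for the /@alias syntax shown below.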

If everything is OK, you should normally be able to connect using your TNS alias:

CDB$ROOT)] sqlplus /@prod20_site1 as sysdba

SQL*Plus: Release 20.0.0.0.0 - Production on Sat May 30 19:20:21 2020
Version 20.2.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Heure de la derniere connexion reussie : Sam. Mai   30 2020 18:36:15 +02:00

Connecte a :
Oracle Database 20c Enterprise Edition Release 20.0.0.0.0 - Production
Version 20.2.0.0.0

SQL> show parameter db_unique_name

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
db_unique_name			     string	 prod20_site1
SQL>

Then I started the instance fs_site3 in nomount mode:

SQL> startup nomount
ORACLE instance started.

Total System Global Area  314570960 bytes
Fixed Size		    9566416 bytes
Variable Size		  188743680 bytes
Database Buffers	  113246208 bytes
Redo Buffers		    3014656 bytes
SQL> show parameter db_unique_name

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
db_unique_name			     string	 FS_SITE3
SQL>

Then, connecting to the broker, I can use the CREATE FAR_SYNC command:

oracle@oraadserver:/u01/ [prod20 (CDB$ROOT)] dgmgrl
DGMGRL for Linux: Release 20.0.0.0.0 - Production on Sat May 30 19:25:31 2020
Version 20.2.0.0.0

Copyright (c) 1982, 2020, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect /
Connected to "prod20_site1"
Connected as SYSDG.
DGMGRL> CREATE FAR_SYNC fs_site3 AS CONNECT IDENTIFIER IS "prod20_site3";
Creating far sync instance "fs_site3".
Connected to "prod20_site1"
Connected to "FS_SITE3"
far sync instance "fs_site3" created
far sync instance "fs_site3" added
DGMGRL>

The far sync instance is created and added to the configuration, as we can verify:

DGMGRL> show configuration

Configuration - prod20

  Protection Mode: MaxAvailability
  Members:
  prod20_site1 - Primary database
    prod20_site2 - Physical standby database
    prod20_site4 - Physical standby database
    fs_site3     - Far sync instance (disabled)
      ORA-16905: The member was not enabled yet.

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 31 seconds ago)

DGMGRL>

Let’s enable the far sync instance

DGMGRL> enable far_sync fs_site3;
Enabled.
DGMGRL> show configuration

Configuration - prod20

  Protection Mode: MaxAvailability
  Members:
  prod20_site1 - Primary database
    prod20_site2 - Physical standby database
    prod20_site4 - Physical standby database
    fs_site3     - Far sync instance

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 38 seconds ago)

DGMGRL>

Now that the far sync instance is created, we can configure the RedoRoutes property for the databases.
The following configuration means
-If prod20_site1 is the primary database, it will send its changes to prod20_site2 and to fs_site3
-And fs_site3 will forward the changes to prod20_site4 (asynchronously) as long as prod20_site1 is the primary database

DGMGRL> edit database prod20_site1 set property redoroutes='(local:prod20_site2,fs_site3)';
Property "redoroutes" updated

DGMGRL> edit far_sync fs_site3 set property redoroutes='(prod20_site1:prod20_site4 ASYNC)';
Property "redoroutes" updated

We will go deeper into RedoRoutes configuration in upcoming blogs

This article, Oracle 20c : Create a Far Sync Instance Is Now Easy, first appeared on Blog dbi services.

Patching Oracle Database Appliance From 18.8 to 19.6

Sat, 2020-05-30 07:27

The ODA software 19.6 is released and people are starting to patch.
A direct patch to version 19.6 is possible from version 18.8.
Before patching your deployment to Oracle Database Appliance release 19.6, you must upgrade the operating system to Oracle Linux 7.
In this blog I describe the steps I followed when patching an ODA from 18.8 to 19.6. I am using an ODA X7-2 (one node).
The first step is of course to download the required patches. You will need to copy the following patches to your ODA.
Patch 31010832 : ORACLE DATABASE APPLIANCE 19.6.0.0.0 SERVER PATCH FOR ALL ODACLI/DCS STACK
Patch 30403662 : ORACLE DATABASE APPLIANCE RDBMS CLONE FOR ODACLI/DCS STACK

The first patch, 31010832, will be used to upgrade the OS to Linux 7 and to patch your server to Appliance 19.6.
The patch 30403662 provides the Oracle 19c database clone files.

The first patch contains 4 files

p31010832_196000_Linux-x86-64_1of4.zip
p31010832_196000_Linux-x86-64_2of4.zip
p31010832_196000_Linux-x86-64_3of4.zip
p31010832_196000_Linux-x86-64_4of4.zip

Just use the unzip command to decompress them

oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p31010832_196000_Linux-x86-64_1of4.zip
oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p31010832_196000_Linux-x86-64_2of4.zip
oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p31010832_196000_Linux-x86-64_3of4.zip
oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p31010832_196000_Linux-x86-64_4of4.zip

Once done, we have the following files, which we will use to update the repository

oda-sm-19.6.0.0.0-200420-server1of4.zip
oda-sm-19.6.0.0.0-200420-server2of4.zip
oda-sm-19.6.0.0.0-200420-server3of4.zip
oda-sm-19.6.0.0.0-200420-server4of4.zip

Next, list your scheduled jobs with the list-schedules option

[root@oda-serveru01]# odacli list-schedules

ID                                       Name                      Description                                        CronExpression                 Disabled
---------------------------------------- ------------------------- -------------------------------------------------- ------------------------------ --------
6d9cd445-8890-4bd6-a713-f6eb8fce35d0     metastore maintenance     internal metastore maintenance                     0 0 0 1/1 * ? *                true   
113ea801-636c-45d9-ad70-448054d825d5     AgentState metastore cleanup internal agentstateentry metastore maintenance     0 0 0 1/1 * ? *                false
e1780bf3-2467-4a89-9298-857fed7fa101     bom maintenance           bom reports generation                             0 0 1 ? * SUN *                false  
2f39ed40-6fc9-4547-b9f1-11889c5c7df9     Big File Upload Cleanup   clean up expired big file uploads.                 0 0 1 ? * SUN *                false  
a624a5c4-4801-444b-9b48-8be99f9f9e48     feature_tracking_job      Feature tracking job                               0 0 20 ? * WED *               false  

Then disable all enabled jobs (those whose Disabled column shows false). This is done with the update-schedule command, using -d to disable and -i to specify the job id

[root@oda-server~]# odacli update-schedule -d -i  113ea801-636c-45d9-ad70-448054d825d5
Update job schedule success
[root@oda-server~]# odacli  update-schedule -d -i  e1780bf3-2467-4a89-9298-857fed7fa101
Update job schedule success
[root@oda-server~]# odacli  update-schedule -d -i  2f39ed40-6fc9-4547-b9f1-11889c5c7df9
Update job schedule success
[root@oda-server~]# odacli  update-schedule -d -i  a624a5c4-4801-444b-9b48-8be99f9f9e48
Update job schedule success
[root@oda-server~]#

When listing the jobs again, the Disabled column must show true for all jobs
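With several jobs to disable, the ID column can be scripted instead of copied by hand. A minimal sketch, assuming the tabular layout printed by odacli list-schedules above (ID in the first column, Disabled flag in the last); the helper name enabled_job_ids is mine:

```shell
#!/bin/sh
# Hypothetical helper (assumption: the list-schedules layout shown above,
# with two header lines): print the IDs of jobs whose Disabled column
# still reads "false".
enabled_job_ids() {
  awk 'NR > 2 && NF > 1 && $NF == "false" { print $1 }'
}

# On the ODA itself, you could then disable them all in one pass:
#   odacli list-schedules | enabled_job_ids |
#     while read -r id; do odacli update-schedule -d -i "$id"; done
```

Afterwards, list the schedules again to confirm every Disabled column shows true.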

odabr is required for the patching to 19.6. odabr is a tool to back up and restore an ODA. The precheck will return Failed if odabr is not installed. For downloading and using odabr, please consult the following document: ODA (Oracle Database Appliance): ODABR a System Backup/Restore Utility (Doc ID 2466177.1)
By default odabr requires 190G of free space in the LVM, which is sometimes not available. In such cases odabr can be run with specific sizing options. In my case I used the command below to take a backup of the ODA.

[root@oda-server~]# /opt/odabr/odabr backup -snap -osize 40 -rsize 20 -usize 80
INFO: 2020-05-29 09:31:15: Please check the logfile '/opt/odabr/out/log/odabr_81097.log' for more details


│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
 odabr - ODA node Backup Restore - Version: 2.0.1-55
 Copyright Oracle, Inc. 2013, 2020
 --------------------------------------------------------
 Author: Ruggero Citton 
 RAC Pack, Cloud Innovation and Solution Engineering Team
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│

INFO: 2020-05-29 09:31:15: Checking superuser
INFO: 2020-05-29 09:31:15: Checking Bare Metal
INFO: 2020-05-29 09:31:15: Removing existing LVM snapshots
WARNING: 2020-05-29 09:31:15: LVM snapshot for 'opt' does not exist
WARNING: 2020-05-29 09:31:15: LVM snapshot for 'u01' does not exist
WARNING: 2020-05-29 09:31:16: LVM snapshot for 'root' does not exist
INFO: 2020-05-29 09:31:16: Checking LVM size
INFO: 2020-05-29 09:31:16: Boot device backup
INFO: 2020-05-29 09:31:16: ...getting boot device
INFO: 2020-05-29 09:31:16: ...making boot device backup
INFO: 2020-05-29 09:31:20: ...boot device backup saved as '/opt/odabr/out/hbi/boot.img'
INFO: 2020-05-29 09:31:21: ...boot device backup check passed
INFO: 2020-05-29 09:31:21: Getting EFI device
INFO: 2020-05-29 09:31:21: ...making efi device backup
INFO: 2020-05-29 09:31:24: ...EFI device backup saved as '/opt/odabr/out/hbi/efi.img'
INFO: 2020-05-29 09:31:24: ...EFI device backup check passed
INFO: 2020-05-29 09:31:24: OCR backup
INFO: 2020-05-29 09:31:26: ...ocr backup saved as '/opt/odabr/out/hbi/ocrbackup_81097.bck'
INFO: 2020-05-29 09:31:26: Making LVM snapshot backup
SUCCESS: 2020-05-29 09:31:28: ...snapshot backup for 'opt' created successfully
SUCCESS: 2020-05-29 09:31:29: ...snapshot backup for 'u01' created successfully
SUCCESS: 2020-05-29 09:31:29: ...snapshot backup for 'root' created successfully
SUCCESS: 2020-05-29 09:31:29: LVM snapshots backup done successfully
[root@oda-server~]#

Snapshots can be verified using the infosnap option

[root@oda-server~]# /opt/odabr/odabr infosnap

│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│
 odabr - ODA node Backup Restore - Version: 2.0.1-55
 Copyright Oracle, Inc. 2013, 2020
 --------------------------------------------------------
 Author: Ruggero Citton 
 RAC Pack, Cloud Innovation and Solution Engineering Team
│▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│


LVM snap name         Status                COW Size              Data%
-------------         ----------            ----------            ------
root_snap             active                20.00 GiB             0.02%
opt_snap              active                40.00 GiB             0.21%
u01_snap              active                80.00 GiB             0.02%


You have new mail in /var/spool/mail/root
[root@oda-server~]#

We can now update the repository with the server software using the patch 31010832

For the first file

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server1of4.zip

{
  "jobId" : "e880e05e-e65a-44ea-897e-2ff376b28066",
  "status" : "Created",
  "message" : "/u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server1of4.zip",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 10:11:11 AM CEST",
  "resourceList" : [ ],
  "description" : "Repository Update",
  "updatedTime" : "May 29, 2020 10:11:11 AM CEST"
}
[root@oda-server~]#

Check the job status with describe-job

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli describe-job -i "e880e05e-e65a-44ea-897e-2ff376b28066"

Job details
----------------------------------------------------------------
                     ID:  e880e05e-e65a-44ea-897e-2ff376b28066
            Description:  Repository Update
                 Status:  Success
                Created:  May 29, 2020 10:11:11 AM CEST
                Message:  /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server1of4.zip

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@oda-server19.6]#

Do the same for the 3 other files. All jobs must return Success.

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server2of4.zip
[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server3of4.zip
[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/oda-sm-19.6.0.0.0-200420-server4of4.zip
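Since the four zips follow one naming pattern, the uploads can also be looped. A sketch under that assumption (PATCH_DIR and the server_zip helper are mine; each submitted job must still be verified with describe-job, as above):

```shell
#!/bin/sh
# Sketch: build the path of server zip part $1 (1..4) from the naming
# pattern used above, so the four update-repository calls can be looped.
PATCH_DIR=/u01/oda_patch_mdi/19.6

server_zip() {
  printf '%s/oda-sm-19.6.0.0.0-200420-server%sof4.zip\n' "$PATCH_DIR" "$1"
}

# On the ODA itself (each call only submits a job; still check every job
# with `odacli describe-job` afterwards):
#   for part in 1 2 3 4; do
#     /opt/oracle/dcs/bin/odacli update-repository -f "$(server_zip "$part")" || break
#   done
```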

Now let’s update the DCS agent

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-dcsagent -v 19.6.0.0.0
{
  "jobId" : "2a2de2bc-d0c4-466a-91a8-818089d18867",
  "status" : "Created",
  "message" : "Dcs agent will be restarted after the update. Please wait for 2-3 mins before executing the other commands",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 10:23:38 AM CEST",
  "resourceList" : [ ],
  "description" : "DcsAgent patching",
  "updatedTime" : "May 29, 2020 10:23:38 AM CEST"
}
[root@oda-server19.6]#

Confirm that the job returns success

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli describe-job -i "2a2de2bc-d0c4-466a-91a8-818089d18867"

Job details
----------------------------------------------------------------
                     ID:  2a2de2bc-d0c4-466a-91a8-818089d18867
            Description:  DcsAgent patching
                 Status:  Success
                Created:  May 29, 2020 10:23:38 AM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
dcs-agent upgrade  to version 19.6.0.0.0 May 29, 2020 10:23:38 AM CEST       May 29, 2020 10:25:17 AM CEST       Success
Update System version                    May 29, 2020 10:25:18 AM CEST       May 29, 2020 10:25:18 AM CEST       Success

[root@oda-server19.6]#

Before upgrading the OS, we must generate a prepatch report. This will show any issues that may cause the patching to fail.

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli create-prepatchreport -v 19.6.0.0.0 -os

Job details
----------------------------------------------------------------
                     ID:  d0a038e3-651d-478f-87a8-6cb1e125e437
            Description:  Patch pre-checks for [OS]
                 Status:  Created
                Created:  May 29, 2020 10:29:55 AM CEST
                Message:  Use 'odacli describe-prepatchreport -i d0a038e3-651d-478f-87a8-6cb1e125e437' to check details of results

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@oda-server19.6]#

We can monitor the status of the job with the describe-job command. As we can see, there are failures

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli describe-job -i d0a038e3-651d-478f-87a8-6cb1e125e437

Job details
----------------------------------------------------------------
                     ID:  d0a038e3-651d-478f-87a8-6cb1e125e437
            Description:  Patch pre-checks for [OS]
                 Status:  Failure
                Created:  May 29, 2020 10:29:55 AM CEST
                Message:  DCS-10001:Internal error encountered: One or more pre-checks failed. Run describe-prepatchreport for more details.

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
task:TaskZLockWrapper_132                May 29, 2020 10:30:01 AM CEST       May 29, 2020 10:40:02 AM CEST       Failure
Run patching pre-checks                  May 29, 2020 10:30:01 AM CEST       May 29, 2020 10:40:02 AM CEST       Success
Check pre-check status                   May 29, 2020 10:40:02 AM CEST       May 29, 2020 10:40:02 AM CEST       Failure

[root@oda-server19.6]#

To get more information about these errors, we use describe-prepatchreport

[root@server-oda 19.6]# /opt/oracle/dcs/bin/odacli describe-prepatchreport  -i d0a038e3-651d-478f-87a8-6cb1e125e437
 
Patch pre-check report
------------------------------------------------------------------------
                 Job ID:  d0a038e3-651d-478f-87a8-6cb1e125e437
            Description:  Patch pre-checks for [OS]
                 Status:  FAILED
                Created:  May 29, 2020 10:29:55 AM CEST
                 Result:  One or more pre-checks failed for [OS]
 
Node Name
---------------
server-oda
 
Pre-Check                      Status   Comments
------------------------------ -------- --------------------------------------
__OS__
Validate supported versions     Success   Validated minimum supported versions.
Validate patching tag           Success   Validated patching tag: 19.6.0.0.0.
Is patch location available     Success   Patch location is available.
Validate if ODABR is installed  Success   Validated ODABR is installed
Validate if ODABR snapshots     Failed    ODABR snapshots are seen on node:
exist                                     server-oda.
Validate LVM free space         Failed    Insufficient space to create LVM
                                          snapshots on node: server-oda. Expected
                                          free space (GB): 190, available space
                                        (GB): 22.
Space checks for OS upgrade     Success   Validated space checks.
Install OS upgrade software     Success   Extracted OS upgrade patches into
                                          /root/oda-upgrade. Do not remove this
                                          directory untill OS upgrade completes.
Verify OS upgrade by running    Success   Results stored in:
preupgrade checks                         '/root/preupgrade-results/
                                          preupg_results-200529103957.tar.gz' .
                                          Read complete report file
                                          '/root/preupgrade/result.html' before
                                          attempting OS upgrade.
Validate custom rpms installed  Failed    Custom RPMs installed. Please check
                                          files
                                          /root/oda-upgrade/
                                          rpms-added-from-ThirdParty and/or
                                          /root/oda-upgrade/
                                          rpms-added-from-Oracle.
Scheduled jobs check            Success    Scheduled jobs found. Disable
                                          scheduled jobs before attempting OS
                                          upgrade.
 
 
[root@server-oda 19.6]#

As we can see, the prepatch report returns two errors
-One related to odabr, because I have already taken a backup of my ODA
-A second one related to custom RPMs installed

For the error related to odabr, the documentation mentions the following

If snapshots are already present on the system when odacli create-prepatchreport is run, this precheck fails, because ODACLI expects to create these snapshots itself. If the user created snapshots or the operating system upgrade was retried (due to a failure) after it had already created the snapshots, this precheck will fail. Note that if snapshots already exist, odacli update-server –c OS still continues with the upgrade.

I decided to ignore this error and to run the patch later with the --force option

But before continuing, be sure to check the following two files and to remove all RPM packages mentioned there.

/root/oda-upgrade/rpms-added-from-Oracle
/root/oda-upgrade/rpms-added-from-ThirdParty

One more thing is to verify that you have enough free space. If needed, you can clean up your old repository

[root@server-oda 19.6]# odacli cleanup-patchrepo -comp GI,DB -v 18.8.0.0.0

In my case below is the status of my /, /u01 and /opt filesystems.

[root@oda-server~]# df -h / /u01 /opt
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot
                       30G   11G   18G  39% /
/dev/mapper/VolGroupSys-LogVolU01
                      148G   96G   45G  69% /u01
/dev/mapper/VolGroupSys-LogVolOpt
                       63G   40G   21G  66% /opt
[root@oda-server~]#
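This space check can also be scripted. A small sketch, assuming GNU df as on the ODA; the free_gb helper and the 40G threshold are illustrative:

```shell
#!/bin/sh
# Sketch: print the free gibibytes on a given mount point (GNU df assumed;
# -BG scales to GiB blocks, --output=avail keeps only the Avail column).
free_gb() {
  df -BG --output=avail "$1" | awk 'NR == 2 { gsub(/G/, ""); print $1 }'
}

# Illustrative usage: warn when /u01 drops below 40G free
#   [ "$(free_gb /u01)" -ge 40 ] || echo "WARNING: /u01 is low on space"
```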

It’s now time to start the patching process by upgrading the server to Linux 7. I ran the following command from the ILOM console, as described in the documentation.

With PuTTY, connect to your ILOM IP as root. Once connected to the ILOM, just type start /SP/console and log in as root on your server.

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y

Serial console started.  To stop, type ESC (


Oracle Linux Server release 6.10
Kernel 4.1.12-124.33.4.el6uek.x86_64 on an x86_64

oda-server login: root
Password:
Last login: Fri May 29 08:28:34 from 10.14.211.131
[root@oda-server~]#

And as root from the ILOM console, run the OS upgrade (just remember that in my case I decided to use --force)

[root@oda-server~]# odacli update-server -v 19.6.0.0.0 -c os --local --force
Verifying OS upgrade
Current OS base version: 6 is lessthan target OS base version: 7
OS needs to upgrade to 7.7
****************************************************************************
*  Depending on the hardware platform, the upgrade operation, including    *
*  node reboot, may take 30-60 minutes to complete. Individual steps in    *
*  the operation may not show progress messages for a while. Do not abort  *
*  upgrade using ctrl-c or by rebooting the system.                        *
****************************************************************************

run: cmd= '[/usr/bin/expect, /root/upgradeos.exp]'
output redirected to /root/odaUpgrade2020-05-29_13-50-23.0388.log

Running pre-upgrade checks.

Running pre-upgrade checks.
........
Running assessment of the system
.........

This will take some time, but just wait. You can follow the progress from the ILOM console


…
[  112.602728] upgrade[6905]: [505/779] (51%) installing libsmbclient-4.9.1-10.el7_7...
[  112.647750] upgrade[6905]: [506/779] (51%) installing mesa-libgbm-18.3.4-6.el7_7...
[  112.691582] upgrade[6905]: [507/779] (51%) installing device-mapper-event-1.02.158-2.0.1.el7_7.2...
[  112.719106] upgrade[6905]: Created symlink /etc/systemd/system/sockets.target.wants/dm-event.socket, pointing to /usr/lib/systemd/system/dm-event.socket.
[  112.736417] upgrade[6905]: Running in chroot, ignoring request.
[  112.745205] upgrade[6905]: [508/779] (51%) installing usbredir-0.7.1-3.el7...
[  112.780151] upgrade[6905]: [509/779] (51%) installing uptrack-offline-1.2.62.offline-0.el7...
[  113.001966] upgrade[6905]: [510/779] (51%) installing dconf-0.28.0-4.el7...
[  113.129938] upgrade[6905]: [511/779] (51%) installing iscsi-initiator-utils-iscsiuio-6.2.0.874-11.0.1.el7...
[  113.178038] upgrade[6905]: [512/779] (51%) installing iscsi-initiator-utils-6.2.0.874-11.0.1.el7...
[  113.362242] upgrade[6905]: [513/779] (52%) installing unbound-libs-1.6.6-1.el7...

At the end of the process the server will reboot. And if everything went fine, you should now have a Linux 7.7 system

[root@oda-server~]# cat /etc/oracle-release
Oracle Linux Server release 7.7
[root@oda-server~]#

After the operating system upgrade is completed successfully, run the post upgrade checks:

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-server-postcheck -v 19.6.0.0.0

Upgrade post-check report
-------------------------

Node Name
---------------
server-oda

Comp  Pre-Check                      Status   Comments
----- ------------------------------ -------- --------------------------------------
OS    OS upgrade check               SUCCESS  OS has been upgraded to OL7
GI    GI upgrade check               INFO     GI home needs to update to 19.6.0.0.200114
GI    GI status check                SUCCESS  Clusterware is running on the node
OS    ODABR snapshot                 WARNING  ODABR snapshot found. Run 'odabr delsnap' to delete.
RPM   Extra RPM check                SUCCESS  No extra RPMs found when OS was at OL6
[root@oda-server~]#

As indicated, let’s remove the snapshots we took with odabr

[root@oda-server~]# /opt/odabr/odabr delsnap
INFO: 2020-05-29 14:22:31: Please check the logfile '/opt/odabr/out/log/odabr_60717.log' for more details

INFO: 2020-05-29 14:22:31: Removing LVM snapshots
INFO: 2020-05-29 14:22:31: ...removing LVM snapshot for 'opt'
SUCCESS: 2020-05-29 14:22:32: ...snapshot for 'opt' removed successfully
INFO: 2020-05-29 14:22:32: ...removing LVM snapshot for 'u01'
SUCCESS: 2020-05-29 14:22:32: ...snapshot for 'u01' removed successfully
INFO: 2020-05-29 14:22:32: ...removing LVM snapshot for 'root'
SUCCESS: 2020-05-29 14:22:32: ...snapshot for 'root' removed successfully
SUCCESS: 2020-05-29 14:22:32: Remove LVM snapshots done successfully

Running the postchecks again

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-server-postcheck -v 19.6.0.0.0

Upgrade post-check report
-------------------------

Node Name
---------------
server-oda

Comp  Pre-Check                      Status   Comments
----- ------------------------------ -------- --------------------------------------
OS    OS upgrade check               SUCCESS  OS has been upgraded to OL7
GI    GI upgrade check               INFO     GI home needs to update to 19.6.0.0.200114
GI    GI status check                SUCCESS  Clusterware is running on the node
OS    ODABR snapshot                 SUCCESS  No ODABR snapshots found
RPM   Extra RPM check                SUCCESS  No extra RPMs found when OS was at OL6
[root@oda-server~]#

After the OS upgrade, let’s now upgrade the remaining components.
We start by updating the DCS admin

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-dcsadmin -v 19.6.0.0.0
{
  "jobId" : "5bdb8d99-d3ae-421c-a05d-60b9ece94021",
  "status" : "Created",
  "message" : null,
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 14:26:32 PM CEST",
  "resourceList" : [ ],
  "description" : "DcsAdmin patching",
  "updatedTime" : "May 29, 2020 14:26:32 PM CEST"
}
[root@oda-server~]#

We validate that the job returns Success

[root@oda-server~]# /opt/oracle/dcs/bin/odacli describe-job -i "5bdb8d99-d3ae-421c-a05d-60b9ece94021"

Job details
----------------------------------------------------------------
                     ID:  5bdb8d99-d3ae-421c-a05d-60b9ece94021
            Description:  DcsAdmin patching
                 Status:  Success
                Created:  May 29, 2020 2:26:32 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Patch location validation                May 29, 2020 2:26:32 PM CEST        May 29, 2020 2:26:32 PM CEST        Success
dcsadmin upgrade                         May 29, 2020 2:26:32 PM CEST        May 29, 2020 2:26:32 PM CEST        Success
Update System version                    May 29, 2020 2:26:32 PM CEST        May 29, 2020 2:26:32 PM CEST        Success

[root@oda-server~]#

We update the DCS components

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-dcscomponents -v 19.6.0.0.0
{
  "jobId" : "4af0626e-d260-4f93-8948-2f241f7b2c48",
  "status" : "Success",
  "message" : null,
  "reports" : null,
  "createTimestamp" : "May 29, 2020 14:27:52 PM CEST",
  "description" : "Job completed and is not part of Agent job list",
  "updatedTime" : "May 29, 2020 14:27:52 PM CEST"
}
[root@oda-server~]#

And then we update the server

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-server -v 19.6.0.0.0
{
  "jobId" : "91f7b206-b097-4ba2-bae7-42a6b3bb0ba6",
  "status" : "Created",
  "message" : "Success of server update will trigger reboot of the node after 4-5 minutes. Please wait until the node reboots.",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 14:31:25 PM CEST",
  "resourceList" : [ ],
  "description" : "Server Patching",
  "updatedTime" : "May 29, 2020 14:31:25 PM CEST"
}
[root@oda-server~]#

This lasts about 45 minutes. Validate that all went fine

[root@oda-server~]# /opt/oracle/dcs/bin/odacli describe-job -i "91f7b206-b097-4ba2-bae7-42a6b3bb0ba6"

Job details
----------------------------------------------------------------
                     ID:  91f7b206-b097-4ba2-bae7-42a6b3bb0ba6
            Description:  Server Patching
                 Status:  Success
                Created:  May 29, 2020 2:31:25 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Patch location validation                May 29, 2020 2:31:33 PM CEST        May 29, 2020 2:31:33 PM CEST        Success
dcs-controller upgrade                   May 29, 2020 2:31:33 PM CEST        May 29, 2020 2:31:33 PM CEST        Success
Creating repositories using yum          May 29, 2020 2:31:33 PM CEST        May 29, 2020 2:31:34 PM CEST        Success
Applying HMP Patches                     May 29, 2020 2:31:34 PM CEST        May 29, 2020 2:31:34 PM CEST        Success
Patch location validation                May 29, 2020 2:31:34 PM CEST        May 29, 2020 2:31:34 PM CEST        Success
oda-hw-mgmt upgrade                      May 29, 2020 2:31:35 PM CEST        May 29, 2020 2:31:35 PM CEST        Success
OSS Patching                             May 29, 2020 2:31:35 PM CEST        May 29, 2020 2:31:35 PM CEST        Success
Applying Firmware Disk Patches           May 29, 2020 2:31:35 PM CEST        May 29, 2020 2:31:40 PM CEST        Success
Applying Firmware Expander Patches       May 29, 2020 2:31:40 PM CEST        May 29, 2020 2:31:44 PM CEST        Success
Applying Firmware Controller Patches     May 29, 2020 2:31:44 PM CEST        May 29, 2020 2:31:48 PM CEST        Success
Checking Ilom patch Version              May 29, 2020 2:31:50 PM CEST        May 29, 2020 2:31:52 PM CEST        Success
Patch location validation                May 29, 2020 2:31:52 PM CEST        May 29, 2020 2:31:53 PM CEST        Success
Save password in Wallet                  May 29, 2020 2:31:54 PM CEST        May 29, 2020 2:31:54 PM CEST        Success
Apply Ilom patch                         May 29, 2020 2:31:54 PM CEST        May 29, 2020 2:39:55 PM CEST        Success
Copying Flash Bios to Temp location      May 29, 2020 2:39:55 PM CEST        May 29, 2020 2:39:55 PM CEST        Success
Starting the clusterware                 May 29, 2020 2:39:55 PM CEST        May 29, 2020 2:41:52 PM CEST        Success
Creating GI home directories             May 29, 2020 2:41:53 PM CEST        May 29, 2020 2:41:53 PM CEST        Success
Cloning Gi home                          May 29, 2020 2:41:53 PM CEST        May 29, 2020 2:44:16 PM CEST        Success
Configuring GI                           May 29, 2020 2:44:16 PM CEST        May 29, 2020 2:46:25 PM CEST        Success
Running GI upgrade root scripts          May 29, 2020 2:46:25 PM CEST        May 29, 2020 3:03:15 PM CEST        Success
Resetting DG compatibility               May 29, 2020 3:03:15 PM CEST        May 29, 2020 3:03:33 PM CEST        Success
Running GI config assistants             May 29, 2020 3:03:33 PM CEST        May 29, 2020 3:04:18 PM CEST        Success
restart oakd                             May 29, 2020 3:04:28 PM CEST        May 29, 2020 3:04:39 PM CEST        Success
Updating GiHome version                  May 29, 2020 3:04:39 PM CEST        May 29, 2020 3:04:45 PM CEST        Success
Update System version                    May 29, 2020 3:04:54 PM CEST        May 29, 2020 3:04:54 PM CEST        Success
preRebootNode Actions                    May 29, 2020 3:04:54 PM CEST        May 29, 2020 3:05:40 PM CEST        Success
Reboot Ilom                              May 29, 2020 3:05:40 PM CEST        May 29, 2020 3:05:40 PM CEST        Success

[root@oda-server~]#

And we can verify that the components were upgraded

[root@oda-server~]# /opt/oracle/dcs/bin/odacli describe-component
System Version
---------------
19.6.0.0.0

Component                                Installed Version    Available Version
---------------------------------------- -------------------- --------------------
OAK                                       19.6.0.0.0            up-to-date
GI                                        19.6.0.0.200114       up-to-date
DB                                        11.2.0.4.190115       11.2.0.4.200114
DCSAGENT                                  19.6.0.0.0            up-to-date
ILOM                                      4.0.4.52.r133103      up-to-date
BIOS                                      41060700              up-to-date
OS                                        7.7                   up-to-date
FIRMWARECONTROLLER                        QDV1RF30              up-to-date
FIRMWAREDISK                              0121                  up-to-date
HMP                                       2.4.5.0.1             up-to-date

[root@oda-server~]#
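With several components involved, the describe-component output can be filtered for anything still pending. A sketch assuming the three-column layout shown above; the helper name pending_updates is mine:

```shell
#!/bin/sh
# Hypothetical helper: from `odacli describe-component` output, print every
# component whose Available Version column holds a version number rather
# than "up-to-date" (i.e. the component still has an update pending).
pending_updates() {
  awk 'NF == 3 && $NF ~ /^[0-9]/ && $NF != "up-to-date" { print $1 " -> " $NF }'
}

# On the ODA itself:
#   /opt/oracle/dcs/bin/odacli describe-component | pending_updates
```

In the output above this would report only the DB component, which I deliberately left at its current version.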

After the server we have to update the storage

[root@oda-server~]# /opt/oracle/dcs/bin/odacli update-storage -v 19.6.0.0.0
{
  "jobId" : "c4b365ff-bd8c-4bca-b53d-d9d9bda90548",
  "status" : "Created",
  "message" : "Success of Storage Update may trigger reboot of node after 4-5 minutes. Please wait till node restart",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 15:16:17 PM CEST",
  "resourceList" : [ ],
  "description" : "Storage Firmware Patching",
  "updatedTime" : "May 29, 2020 15:16:17 PM CEST"
}
[root@oda-server~]#

We can verify that the job was successful

[root@oda-server~]# /opt/oracle/dcs/bin/odacli describe-job -i "c4b365ff-bd8c-4bca-b53d-d9d9bda90548"

Job details
----------------------------------------------------------------
                     ID:  c4b365ff-bd8c-4bca-b53d-d9d9bda90548
            Description:  Storage Firmware Patching
                 Status:  Success
                Created:  May 29, 2020 3:16:17 PM CEST
                Message:

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Applying Firmware Disk Patches           May 29, 2020 3:16:18 PM CEST        May 29, 2020 3:16:21 PM CEST        Success
Applying Firmware Controller Patches     May 29, 2020 3:16:21 PM CEST        May 29, 2020 3:16:26 PM CEST        Success
preRebootNode Actions                    May 29, 2020 3:16:26 PM CEST        May 29, 2020 3:16:26 PM CEST        Success
Reboot Ilom                              May 29, 2020 3:16:26 PM CEST        May 29, 2020 3:16:26 PM CEST        Success

[root@oda-server~]#

I did not patch the existing database home because I wanted to keep the current version for my existing databases.

To be able to create new 19c databases, we have to update the repository with the corresponding RDBMS clone.
After unzipping the archive:

oracle@server-oda:/u01/oda_patch_mdi/19.6/ [ORCL] unzip p30403662_196000_Linux-x86-64.zip
Archive:  p30403662_196000_Linux-x86-64.zip
 extracting: odacli-dcs-19.6.0.0.0-200326-DB-19.6.0.0.zip

I just have to update the repository with the db clone

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli update-repository -f /u01/oda_patch_mdi/19.6/odacli-dcs-19.6.0.0.0-200326-DB-19.6.0.0.zip
{
  "jobId" : "e8e58467-460f-4b07-86db-d4b71f1e5884",
  "status" : "Created",
  "message" : "/u01/oda_patch_mdi/19.6/odacli-dcs-19.6.0.0.0-200326-DB-19.6.0.0.zip",
  "reports" : [ ],
  "createTimestamp" : "May 29, 2020 15:23:37 PM CEST",
  "resourceList" : [ ],
  "description" : "Repository Update",
  "updatedTime" : "May 29, 2020 15:23:37 PM CEST"
}
[root@oda-server19.6]#

Then I checked that the job completed successfully:

[root@oda-server19.6]# /opt/oracle/dcs/bin/odacli describe-job -i "e8e58467-460f-4b07-86db-d4b71f1e5884"

Job details
----------------------------------------------------------------
                     ID:  e8e58467-460f-4b07-86db-d4b71f1e5884
            Description:  Repository Update
                 Status:  Success
                Created:  May 29, 2020 3:23:37 PM CEST
                Message:  /u01/oda_patch_mdi/19.6/odacli-dcs-19.6.0.0.0-200326-DB-19.6.0.0.zip

Task Name                                Start Time                          End Time                            Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------

[root@oda-server19.6]#

Encountered ACFS Issue
During the patch I did not have any issue; all went fine. But at the end of the patch my existing databases did not come up because of an ACFS issue.
Indeed, the ASM instance was up, but the ASM proxy instance (APX) was unstable.

[grid@oda-server~]$ srvctl start  asm -proxy -node oda-server
PRCR-1013 : Failed to start resource ora.proxy_advm
PRCR-1064 : Failed to start resource ora.proxy_advm on node oda-server
CRS-5017: The resource action "ora.proxy_advm start" encountered the following error:
ORA-01092: ORACLE instance terminated. Disconnection forced
Process ID: 0
Session ID: 0 Serial number: 0
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/oda-server/crs/trace/crsd_oraagent_grid.trc".

Looking into the trace file:

[root@oda-servertrace]# grep -i ORA- /u01/app/grid/diag/crs/oda-server/crs/trace/crsd_oraagent_grid.trc
2020-05-29 16:51:18.216 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-03113: end-of-file on communication channel
2020-05-29 16:51:18.216 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] InstAgent::startInstance 250 ORA-3113retryCount:0 m_instanceType:2 m_lastOCIError:3113
2020-05-29 16:51:39.641 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-03113: end-of-file on communication channel
2020-05-29 16:51:39.641 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] InstAgent::startInstance 250 ORA-3113retryCount:1 m_instanceType:2 m_lastOCIError:3113
2020-05-29 16:52:01.078 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-03113: end-of-file on communication channel
2020-05-29 16:52:01.078 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] InstAgent::startInstance 250 ORA-3113retryCount:2 m_instanceType:2 m_lastOCIError:3113
2020-05-29 16:52:01.098 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-01092: ORACLE instance terminated. Disconnection forced
2020-05-29 16:52:01.099 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] InstAgent::startInstance 380 throw excp what:ORA-01092: ORACLE instance terminated. Disconnection forced
2020-05-29 16:52:01.099 :CLSDYNAM:916408064: [ora.proxy_advm]{1:38257:2942} [start] ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-01092: ORACLE instance terminated. Disconnection forced
[root@oda-servertrace]#

For some reason the ACFS modules were not loaded. My databases were Oracle 11.2.0.4 and were using ACFS.

[root@oda-server~]# /sbin/lsmod | grep oracle
oracleoks             724992  0
oracleafd             229376  0

So I first stopped the CRS:

[root@oda-server~]#  crsctl stop crs

And then I reinstalled the ACFS modules with the following acfsroot command:

[root@oda-server~]# which acfsroot
/u01/app/19.0.0.0/grid/bin/acfsroot

[root@oda-server~]# /u01/app/19.0.0.0/grid/bin/acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
[root@oda-server~]#

Running the lsmod command again, I see that things are better:

[root@oda-server~]# /sbin/lsmod | grep oracle
oracleacfs           5828608  0
oracleadvm           1204224  0
oracleoks             724992  2 oracleadvm,oracleacfs
oracleafd             229376  0
[root@oda-server~]#
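The same verification can be scripted as a quick health check; a minimal sketch, assuming a Linux host with a readable /proc/modules:

```shell
# Quick health check: all three ADVM/ACFS kernel modules must be loaded
# before the ASM proxy instance can start.
for m in oracleoks oracleadvm oracleacfs; do
  if awk '{print $1}' /proc/modules | grep -qx "$m"; then
    echo "$m loaded"
  else
    echo "$m MISSING"
  fi
done
```

Any "MISSING" line means the acfsroot install (or a reboot) is still needed.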

I then rebooted the server, after which all databases and the ASM proxy instance came back up and running. I just wanted to mention this issue I had and hope this will help.

Conclusion

Some recommendations:
Remove all rpm packages that were manually installed.
Verify that you have enough space on the /, /u01 and /opt filesystems (/opt and /u01 can be increased online).
Launch the OS upgrade from the ILOM console.
I hope this blog will help.

The article Patching Oracle Database Appliance From 18.8 to 19.6 appeared first on Blog dbi services.

Documentum – D2-REST Privileged Client utility 16.7+ fails with class not found

Fri, 2020-05-29 13:30

In previous blogs, I already talked about how to fix issues with the D2 Privileged Client utility (for 16.4.0, 16.5.0 and 16.5.1) as well as how to execute/use this utility for automatic/silent registration & approval of DFC Client IDs. I’m now back to this topic because I recently had the opportunity to work with D2-REST 16.7.1, so I took some time to compare the 16.7.0, 16.7.1 and newly released 20.2 versions of this utility as well. As you can guess from the fact that I’m writing this blog, it’s not error free.

In the previous blog, I mentioned five different issues with the utility: no shebang, an incorrect first line, a log4j configuration using a hardcoded Windows path, DOS line endings in the shell script and, finally, no support for secure communication. The first bad news is that despite the SR I opened with OpenText Support a year ago, and the bugs they opened internally, the first four issues hadn’t been fixed until the 20.2 version (here I’m only talking about D2-REST, which usually doesn’t receive many updates/patches, if any at all). None of the D2-REST war files available on the OpenText site, except the 20.2 one, include an updated version of the shell script. It would be easy to fix all four issues, but apparently nobody is taking the time to. On the other hand, the missing support for secure communication has fortunately been fixed in the 16.7.1 version. So starting with this one, you can use the D2 Privileged Client utility without having to first switch your client dfc.properties to native communication; you can just run it even with secure channels. That’s the status of the issues I discussed in my previous blog, but there is more, and that’s the main goal of this blog!
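For the record, the first four issues are trivial to fix by hand. Below is a minimal sketch of the DOS-format and shebang fixes, run against a stand-in file (the file name and contents are illustrative, not the real utility script; the hardcoded log4j path would be fixed with an analogous sed), assuming GNU sed:

```shell
# Stand-in for the broken script header shipped before 20.2: DOS (CRLF) line
# endings and no shebang.
printf 'echo "Running D2PrivilegedClientUtil"\r\n' > util.sh

sed -i 's/\r$//' util.sh          # strip DOS line endings (CRLF -> LF)
sed -i '1i #!/bin/bash' util.sh   # add the missing shebang as the first line
chmod u+x util.sh

head -n 1 util.sh                 # -> #!/bin/bash
```

After these two seds, the script runs directly under bash instead of failing on the CR characters.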

Starting with 16.7.0, OpenText added a second jar file, containing the Java sources of the D2 Privileged Client utility, in the same folder. Which is fine, right? Well, in principle yes, that’s good because we can actually see what is done and how it’s done using the source code… But this second jar file actually causes a new issue! After fixing the issues discussed earlier and trying to run the utility:

[weblogic@wsd2rest-0 ~]$ cd $APPLICATIONS
[weblogic@wsd2rest-0 applications]$ 
[weblogic@wsd2rest-0 applications]$ ls -l
total 4
drwxr-x---. 8 weblogic weblogic 4096 Mar 22 08:38 D2-REST
[weblogic@wsd2rest-0 applications]$ 
[weblogic@wsd2rest-0 applications]$ cd D2-REST/utils/d2privilegedclient/
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ ls -l
total 44
-rw-r-----. 1 weblogic weblogic 18469 Mar 22 08:39 D2PrivilegedClientUtil-16.7.1.jar
-rw-r-----. 1 weblogic weblogic  9431 Mar 22 08:39 D2PrivilegedClientUtil-16.7.1-sources.jar
-rw-r-----. 1 weblogic weblogic  1214 Mar 22 08:39 D2PrivilegedClientUtil.cmd
-rwx------. 1 weblogic weblogic  1119 Mar 22 09:11 D2PrivilegedClientUtil.sh
-rw-r-----. 1 weblogic weblogic  1141 Mar 22 09:11 D2PrivilegedClientUtil.sh.orig
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ ./D2PrivilegedClientUtil.sh -d ${gr} -u ${user} -p ${pw}
Running D2PrivilegedClientUtil
Error: Could not find or load main class com.opentext.d2.util.D2PrivilegedClientUtil
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ cat D2PrivilegedClientUtil.sh
#!/bin/bash

# Run this script with no arguments to get command line usage

# Specify location of WEB-INF folder
webinfdir=../../WEB-INF

echo "Running D2PrivilegedClientUtil"

# The following will suppress DFC warnings about non-existing log4j configuration
# if the specified log4j.properties file actually exists ... adjust as desired.
log4j_configuration=${webinfdir}/classes/log4j.properties

# Note: In order to run this utility on a Developer system, the following jar files must
# be copied from the corresponding DocumentumCoreProject/dfs*/dfs-sdk*/lib/java/utils
# folder to WEB-INF/lib:
#
#   aspectjrt.jar
#   log4j.jar

osname=`uname`
key=`expr substr $osname 1 6`
if [ `expr match "$key" "CYGWIN"` != "6" ]
then
   # Linux
    utiljar=`find D2PrivilegedClientUtil*.jar`
    classpath="${utiljar}:${webinfdir}/classes/:${webinfdir}/lib/*"
else
   # Cygwin
    utiljar=`/usr/bin/find.exe D2PrivilegedClientUtil*.jar`
    classpath="${utiljar};${webinfdir}/classes/;${webinfdir}/lib/*"
fi

java -Dlog4j.configuration="${log4j_configuration}" -cp "${classpath}" com.opentext.d2.util.D2PrivilegedClientUtil $*
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ sed -i 's,^java ,echo "------"\necho "The Classpath is set to >>>>${classpath}<<<<"\necho "------"\n\n&,' D2PrivilegedClientUtil.sh
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ cat D2PrivilegedClientUtil.sh
#!/bin/bash

# Run this script with no arguments to get command line usage

# Specify location of WEB-INF folder
webinfdir=../../WEB-INF

echo "Running D2PrivilegedClientUtil"

# The following will suppress DFC warnings about non-existing log4j configuration
# if the specified log4j.properties file actually exists ... adjust as desired.
log4j_configuration=${webinfdir}/classes/log4j.properties

# Note: In order to run this utility on a Developer system, the following jar files must
# be copied from the corresponding DocumentumCoreProject/dfs*/dfs-sdk*/lib/java/utils
# folder to WEB-INF/lib:
#
#   aspectjrt.jar
#   log4j.jar

osname=`uname`
key=`expr substr $osname 1 6`
if [ `expr match "$key" "CYGWIN"` != "6" ]
then
   # Linux
    utiljar=`find D2PrivilegedClientUtil*.jar`
    classpath="${utiljar}:${webinfdir}/classes/:${webinfdir}/lib/*"
else
   # Cygwin
    utiljar=`/usr/bin/find.exe D2PrivilegedClientUtil*.jar`
    classpath="${utiljar};${webinfdir}/classes/;${webinfdir}/lib/*"
fi

echo "------"
echo "The Classpath is set to >>>>${classpath}<<<<"
echo "------"

java -Dlog4j.configuration="${log4j_configuration}" -cp "${classpath}" com.opentext.d2.util.D2PrivilegedClientUtil $*
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ ./D2PrivilegedClientUtil.sh -d ${gr} -u ${user} -p ${pw}
Running D2PrivilegedClientUtil
------
The Classpath is set to >>>>D2PrivilegedClientUtil-16.7.1.jar
D2PrivilegedClientUtil-16.7.1-sources.jar:../../WEB-INF/classes/:../../WEB-INF/lib/*<<<<
------
Error: Could not find or load main class com.opentext.d2.util.D2PrivilegedClientUtil
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$

 

As you can see above, the “${classpath}” shell variable actually spans two lines… This breaks the java command that is supposed to execute the utility, since it cannot find the correct class file in the classpath. I opened another SR with OpenText Support (SR#4480014) for this new issue, to explain what the issue is and how to solve it. While writing this blog, the ticket is still under investigation on the OpenText side and I have no estimate on when exactly it will be fixed, but don’t expect it before at least 20.3, or maybe even 20.4 for D2-REST, since 20.3 is approaching fast… For D2, it might be included in a patch for 20.2 already, let’s see.

Anyway, until then, fixing the utility is really simple: either rename the sources jar file so it doesn’t get included in the classpath (simplest), or update the command that fetches the jar files so it builds the classpath correctly:

[weblogic@wsd2rest-0 d2privilegedclient]$ ### First solution: rename the source jar file:
[weblogic@wsd2rest-0 d2privilegedclient]$ mv D2PrivilegedClientUtil-16.7.1-sources.jar bck_D2PrivilegedClientUtil-16.7.1-sources.jar
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ ./D2PrivilegedClientUtil.sh -d ${gr} -u ${user} -p ${pw}
Running D2PrivilegedClientUtil
------
The Classpath is set to >>>>D2PrivilegedClientUtil-16.7.1.jar:../../WEB-INF/classes/:../../WEB-INF/lib/*<<<<
------
2020-03-22 09:25:52,701 UTC [INFO ] (main) - c.e.x.r.c.d.dao.impl.PrivilegeClientDaoImpl   : Checking dm_client_rights for dfc: dfc_Il39fcV3grqLj46AqOXV6ChBaEWk
D2-REST_wsd2rest-0_hBaEWk
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ mv bck_D2PrivilegedClientUtil-16.7.1-sources.jar D2PrivilegedClientUtil-16.7.1-sources.jar
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ ### Second solution: change the classpath (1st possibility)
[weblogic@wsd2rest-0 d2privilegedclient]$ grep "utiljar=" D2PrivilegedClientUtil.sh
    utiljar=`find D2PrivilegedClientUtil*.jar`
    utiljar=`/usr/bin/find.exe D2PrivilegedClientUtil*.jar`
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ sed -i 's,\(utiljar=.*D2PrivilegedClientUtil.*jar\),\1 | grep -v sources,' D2PrivilegedClientUtil.sh
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ grep "utiljar=" D2PrivilegedClientUtil.sh
    utiljar=`find D2PrivilegedClientUtil*.jar | grep -v sources`
    utiljar=`/usr/bin/find.exe D2PrivilegedClientUtil*.jar | grep -v sources`
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ ./D2PrivilegedClientUtil.sh -d ${gr} -u ${user} -p ${pw}
Running D2PrivilegedClientUtil
------
The Classpath is set to >>>>D2PrivilegedClientUtil-16.7.1.jar:../../WEB-INF/classes/:../../WEB-INF/lib/*<<<<
------
2020-03-22 09:28:12,043 UTC [INFO ] (main) - c.e.x.r.c.d.dao.impl.PrivilegeClientDaoImpl   : Checking dm_client_rights for dfc: dfc_Il39fcV3grqLj46AqOXV6ChBaEWk
D2-REST_wsd2rest-0_hBaEWk
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ sed -i 's,\(utiljar=.*D2PrivilegedClientUtil.*jar\).*sources,\1,' D2PrivilegedClientUtil.sh
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ ### Second solution: change the classpath (2nd possibility)
[weblogic@wsd2rest-0 d2privilegedclient]$ grep "utiljar=" D2PrivilegedClientUtil.sh
    utiljar=`find D2PrivilegedClientUtil*.jar`
    utiljar=`/usr/bin/find.exe D2PrivilegedClientUtil*.jar`
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ sed -i 's,\(utiljar=.*D2PrivilegedClientUtil.*jar\),\1 | xargs | sed "s@ @:@",' D2PrivilegedClientUtil.sh
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ grep "utiljar=" D2PrivilegedClientUtil.sh
    utiljar=`find D2PrivilegedClientUtil*.jar | xargs | sed "s@ @:@"`
    utiljar=`/usr/bin/find.exe D2PrivilegedClientUtil*.jar | xargs | sed "s@ @:@"`
[weblogic@wsd2rest-0 d2privilegedclient]$ 
[weblogic@wsd2rest-0 d2privilegedclient]$ ./D2PrivilegedClientUtil.sh -d ${gr} -u ${user} -p ${pw}
Running D2PrivilegedClientUtil
------
The Classpath is set to >>>>D2PrivilegedClientUtil-16.7.1.jar:D2PrivilegedClientUtil-16.7.1-sources.jar:../../WEB-INF/classes/:../../WEB-INF/lib/*<<<<
------
2020-03-22 09:30:08,477 UTC [INFO ] (main) - c.e.x.r.c.d.dao.impl.PrivilegeClientDaoImpl   : Checking dm_client_rights for dfc: dfc_Il39fcV3grqLj46AqOXV6ChBaEWk
D2-REST_wsd2rest-0_hBaEWk
[weblogic@wsd2rest-0 d2privilegedclient]$

 

There are plenty of ways to fix this, depending on the distribution you are running on. I haven’t tested on Cygwin because, honestly, who in 2020 is still using Cygwin? The simplest and most portable way to avoid the issue is obviously the first one (renaming the sources jar file) because it doesn’t change the content of the shell script. Therefore, I would personally go with this one until OpenText fixes the product.
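The root cause is also easy to reproduce in isolation. The sketch below uses stand-in empty jar files in a demo directory; the `paste -sd:` join at the end is simply another way to achieve what the `xargs | sed` possibility above does:

```shell
mkdir -p demo
touch demo/D2PrivilegedClientUtil-16.7.1.jar demo/D2PrivilegedClientUtil-16.7.1-sources.jar

# `find` prints one match per line, so the command substitution keeps an
# embedded newline, which later breaks the -cp argument of the java command:
utiljar=$(find demo -name 'D2PrivilegedClientUtil*.jar')
echo "$utiljar" | wc -l          # -> 2

# joining the matches with ':' keeps the classpath on a single line:
classpath=$(find demo -name 'D2PrivilegedClientUtil*.jar' | paste -sd: -)
printf '%s\n' "$classpath"
```

With both jars joined by ':' the classpath is at least valid, although excluding the sources jar entirely remains the cleaner option.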

 

The article Documentum – D2-REST Privileged Client utility 16.7+ fails with class not found appeared first on Blog dbi services.

pgBackRest – a simple privileges-related issue

Fri, 2020-05-29 07:07

I recently had to replace Barman with pgBackRest as the Backup & Recovery solution for one of my customers.
The purpose of this blog is not to show how I did the setup (some dbi blogs already exist for that), but only to share a small issue I encountered during the creation of the Stanza:

postgres@pgbackrest:/ [cms2] pgbackrest --stanza=cms2_primary stanza-create
ERROR: [058]: version '11' and path '(null)' queried from cluster do not match version '11' and '/services/data/pgdata/cms2' read from '/services/data/pgdata/cms2/global/pg_control'
HINT: the pg1-path and pg1-port settings likely reference different clusters.
postgres@pgbackrest:/ [cms2]

According to the error message, it seems that the path to the data directory cannot be retrieved.
I’m using a dedicated user to perform the backup operations (instead of the “postgres” user):

postgres@pgbackrest:/ [cms2] cat /etc/pgbackrest.conf
[global] repo1-host=pgserver
repo1-host-user=postgres
log-level-file=detail


[cms2_primary] pg1-path=/services/data/pgdata/cms2
pg1-socket-path=/tmp
pg1-user=backupuser
postgres@pgbackrest:/ [cms2]

The source of the problem is that backupuser is not allowed to read the instance settings:

(postgres@[local]:5432) [postgres] > \c postgres backupuser
You are now connected to database "postgres" as user "backupuser".
(backupuser@[local]:5432) [postgres] > select setting from pg_catalog.pg_settings where name = 'data_directory';
setting
---------
(0 rows)

So, to solve this, we need to grant the “pg_read_all_settings” role to the user:

(postgres@[local]:5432) [postgres] > grant pg_read_all_settings to backupuser;
GRANT ROLE
(postgres@[local]:5432) [postgres] > \c postgres backupuser
You are now connected to database "postgres" as user "backupuser".
(backupuser@[local]:5432) [postgres] > select setting from pg_catalog.pg_settings where name = 'data_directory';
setting
----------------------------
/services/data/pgdata/cms2
(1 row)

Some other privileges are also required to allow the user to perform backup/restore operations:

(postgres@[local]:5432) [postgres] > grant execute on function pg_create_restore_point(text) to backupuser;
GRANT
(postgres@[local]:5432) [postgres] > grant execute on function pg_start_backup(text, boolean, boolean) to backupuser;
GRANT
(postgres@[local]:5432) [postgres] > grant execute on function pg_stop_backup(boolean, boolean) to backupuser;
GRANT
(postgres@[local]:5432) [postgres] > grant execute on function pg_create_restore_point(text) to backupuser;
GRANT
(postgres@[local]:5432) [postgres] > grant execute on function pg_switch_wal() to backupuser;
GRANT
(postgres@[local]:5432) [postgres] >
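As a side note, all the grants above can be collected into a single script. The sketch below only generates the SQL file; the role name backupuser and the /tmp path are assumptions to adjust, and it still has to be applied with psql as a superuser as shown in the comment:

```shell
backup_user=backupuser   # adjust to your dedicated backup role

# Write every privilege needed by pgBackRest for this user into one file.
cat > /tmp/pgbackrest_grants.sql <<EOF
GRANT pg_read_all_settings TO ${backup_user};
GRANT EXECUTE ON FUNCTION pg_create_restore_point(text) TO ${backup_user};
GRANT EXECUTE ON FUNCTION pg_start_backup(text, boolean, boolean) TO ${backup_user};
GRANT EXECUTE ON FUNCTION pg_stop_backup(boolean, boolean) TO ${backup_user};
GRANT EXECUTE ON FUNCTION pg_switch_wal() TO ${backup_user};
EOF

# apply once as a superuser, e.g.:
#   psql -U postgres -d postgres -f /tmp/pgbackrest_grants.sql
cat /tmp/pgbackrest_grants.sql
```

Keeping the grants in one file makes it easy to replay them on a freshly restored or newly built instance.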

With the correct privileges, the Stanza can now be created:

postgres@pgbackrest:/ [cms2] pgbackrest --stanza=cms2_primary stanza-create --log-level-console=detail
2020-04-28 20:36:12.252 P00 INFO: stanza-create command begin 2.26: --log-level-console=detail --pg1-path=/services/data/pgdata/cms2 --pg1-socket-path=/tmp --pg1-user=backupuser --repo1-path=/var/lib/pgbackrest --stanza=cms2_primary
2020-04-28 20:36:12.785 P00 INFO: stanza-create command end: completed successfully (534ms)
postgres@pgbackrest:/ [cms2] postgres@pgbackrest:/ [cms2] pgbackrest --stanza=cms2_primary check --log-level-console=detail
2020-04-28 20:36:48.379 P00 INFO: check command begin 2.26: --log-level-console=detail --pg1-path=/services/data/pgdata/cms2 --pg1-socket-path=/tmp --pg1-user=backupuser --repo1-path=/var/lib/pgbackrest --stanza=cms2_primary
2020-04-28 20:36:49.887 P00 INFO: WAL segment 000000010000004E00000036 successfully archived to '/var/lib/pgbackrest/archive/cms2_primary/11-1/000000010000004E/000000010000004E00000036-5164039a67a3f865cfc6c188b9cd02d38c2a1c55.gz'
2020-04-28 20:36:49.887 P00 INFO: check command end: completed successfully (1509ms)
postgres@pgbackrest:/ [cms2]

And then the initial full backup can be taken:

postgres@pgbackrest:/ [cms2] pgbackrest backup --type=full --stanza=cms2_primary --log-level-console=detail
2020-04-28 23:21:50.546 P00 INFO: backup command begin 2.26: --pg1-host=pgserver --pg1-path=/services/data/pgdata/cms2 --pg1-port=5432 --pg1-socket-path=/tmp --pg1-user=backupuser --repo1-path=/services/data/pgdata --repo1-retention-full=2 --stanza=cms2_primary --start-fast --type=diff
2020-04-28 23:21:51.303 P00 WARN: no prior backup exists, diff backup has been changed to full
2020-04-28 23:21:51.303 P00 INFO: execute non-exclusive pg_start_backup(): backup begins after the requested immediate checkpoint completes
2020-04-28 23:21:52.012 P00 INFO: backup start archive = 000000010000004F00000007, lsn = 4F/1C000028
2020-04-28 23:24:00.160 P01 INFO: backup file pgserver:/services/data/pgdata/cms2/base/23633/58871.8 (2GB, 2%) checksum 7573c5848b1d2352666d3c3937c818e92654876e
2020-04-28 23:25:58.255 P01 INFO: backup file pgserver:/services/data/pgdata/cms2/base/23633/58871.7 (2GB, 4%) checksum 09149d631f2b703b47e928c296aec70f21a74a00
2020-04-28 23:28:04.379 P01 INFO: backup file pgserver:/services/data/pgdata/cms2/base/23633/58871.6 (2GB, 6%) checksum f4196a0f54037b18577b8f3f08e61b6b7f84deb1
2020-04-28 23:29:57.413 P01 INFO: backup file pgserver:/services/data/pgdata/cms2/base/23633/58871.5 (2GB, 9%) checksum 9e334b2366f5bee850c782a62827e0fe1c83e1df
...
...
2020-04-29 00:54:48.900 P01 INFO: backup file pgserver:/services/data/pgdata/cms2/base/1/13084 (0B, 100%)
2020-04-29 00:54:48.906 P01 INFO: backup file pgserver:/services/data/pgdata/cms2/base/1/13079 (0B, 100%)
2020-04-29 00:54:48.910 P01 INFO: backup file pgserver:/services/data/pgdata/cms2/base/1/13074 (0B, 100%)
2020-04-29 00:54:48.929 P00 INFO: full backup size = 88.1GB
2020-04-29 00:54:48.930 P00 INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive
2020-04-29 00:54:49.232 P00 INFO: backup stop archive = 000000010000004F00000008, lsn = 4F/20000050
2020-04-29 00:54:49.339 P00 INFO: check archive for segment(s) 000000010000004F00000007:000000010000004F00000008
2020-04-29 00:54:50.011 P00 INFO: new backup label = 20200428-232153F
2020-04-29 00:54:50.452 P00 INFO: backup command end: completed successfully (5579906ms)
postgres@pgbackrest:/ [cms2]

Fairly easy… but not documented. So I hope this helps.

The article pgBackRest – a simple privileges-related issue appeared first on Blog dbi services.

Dbvisit – Switchover failing with ORA-00600 error due to Unified Auditing being enabled

Thu, 2020-05-28 02:01

I recently deployed Dbvisit version 9.0.10 on a Standard Edition 2 (SE2) 18.7 database. The graceful switchover failed with the error: ORA-00600: Interner Fehlercode, Argumente: [17090], [], [], [], [], [], [], [], [], [],[], [] (DBD ERROR: OCIStmtExecute)

Problem description

Running a graceful switchover fails with ORA-00600 when converting the standby database if Unified Auditing is enabled. The output is the following:

oracle@ODA01:/u01/app/dbvisit/standby/ [TESTDB] ./dbvctl -d TESTDB -o switchover
=============================================================
Dbvisit Standby Database Technology (9.0.10_0_g064b53e) (pid 15927)
dbvctl started on ODA01-replica: Thu May 7 16:22:28 2020
=============================================================
 
>>> Starting Switchover between ODA01-replica and SEERP1SOP010-replica
 
Running pre-checks ... done
Pre processing ... done
Processing primary ... done
Processing standby ... done
Converting standby ... failed
Performing rollback ... done
 
>>> Database on server ODA01-replica is still a Primary Database
>>> Database on server SEERP1SOP010-replica is still a Standby Database
 
 
<<<>>>
PID:15927
TRACEFILE:15927_dbvctl_switchover_TESTDB_202005071622.trc
SERVER:ODA01-replica
ERROR_CODE:1
Remote execution error on SEERP1SOP010-replica.
 
==============Remote Output start: SEERP1SOP010-replica===============
Standby file
/u02/app/oracle/oradata/TESTDB_DC13/TESTDB_DC13/datafile/o1_mf_system_h5bcgo03_.dbf renamed to /u02/app/oracle/oradata/TESTDB_DC41/TESTDB_DC41/datafile/o1_mf_system_hbzpn4l6_.dbf in
database TESTDB.
Standby file
...
...
...
/u02/app/oracle/oradata/TESTDB_DC41/TESTDB_DC41/datafile/o1_mf_ts_edc_d_hbzpp9yw_.dbf in database TESTDB.
Standby file /u04/app/oracle/redo/TESTDB/TESTDB_DC13/onlinelog/o1_mf_2_h5bfq0xq_.log
renamed to /u01/app/dbvisit/standby/gs/TESTDB/X.DBVISIT.REDO_2.LOG in database TESTDB.
<<<>>>
PID:36409
TRACEFILE:36409_dbvctl_f_gs_convert_standby_TESTDB_202005071626.trc
SERVER:SEERP1SOP010-replica
ERROR_CODE:600
ORA-00600: Interner Fehlercode, Argumente: [17090], [], [], [], [], [], [], [], [], [],[], [] (DBD ERROR: OCIStmtExecute)
>>>> Dbvisit Standby terminated <<<<
>>>> Dbvisit Standby terminated <<<<

Troubleshooting

The problem is known to Dbvisit. Their engineering team is already working on a fix, planned to be released in version 9.1.XX of Dbvisit.

The workaround is to disable Unified Auditing.

Disable Unified Auditing

In this part I will describe how to disable Unified Auditing. This is an ODA, so my environment is using Oracle Restart.

1- Shutdown the database

The database is shut down as the oracle user using srvctl:
oracle@ODA01:/u01/app/dbvisit/standby/ [rdbms18000_1] srvctl stop database -d TESTDB_DC13

2- Shutdown listener

The listener is owned by the grid user, and stopping it is also done using srvctl:
grid@ODA01:/home/grid/ [+ASM1] srvctl stop listener -listener LISTENER

3- Relink oracle executable

For the current database home, the oracle executable needs to be relinked:
oracle@ODA01:/u01/app/dbvisit/standby/ [rdbms18000_1] cd $ORACLE_HOME/rdbms/lib
 
oracle@ODA01:/u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/ [rdbms18000_1] make -f ins_rdbms.mk uniaud_off ioracle ORACLE_HOME=$ORACLE_HOME
/usr/bin/ar d /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/libknlopt.a kzaiang.o
/usr/bin/ar cr /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/libknlopt.a /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/kzanang.o
chmod 755 /u01/app/oracle/product/18.0.0.0/dbhome_1/bin
- Linking Oracle
rm -f /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/oracle
/u01/app/oracle/product/18.0.0.0/dbhome_1/bin/orald -o /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/oracle -m64 -z noexecstack -Wl,--disable-new-dtags -L/u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/ -L/u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ -L/u01/app/oracle/product/18.0.0.0/dbhome_1/lib/stubs/ -Wl,-E /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/opimai.o /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/ssoraed.o /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/ttcsoi.o -Wl,--whole-archive -lperfsrv18 -Wl,--no-whole-archive /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/nautab.o /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/naeet.o /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/naect.o /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/naedhs.o /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/config.o -ldmext -lserver18 -lodm18 -lofs -lcell18 -lnnet18 -lskgxp18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 -lclient18 -lvsnst18 -lcommon18 -lgeneric18 -lknlopt -loraolap18 -lskjcx18 -lslax18 -lpls18 -lrt -lplp18 -ldmext -lserver18 -lclient18 -lvsnst18 -lcommon18 -lgeneric18 `if [ -f /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/libavserver18.a ] ; then echo "-lavserver18" ; else echo "-lavstub18"; fi` `if [ -f /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/libavclient18.a ] ; then echo "-lavclient18" ; fi` -lknlopt -lslax18 -lpls18 -lrt -lplp18 -ljavavm18 -lserver18 -lwwg `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ldflags` -lncrypt18 -lnsgr18 -lnzjs18 -ln18 -lnl18 -lngsmshd18 -lnro18 `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ldflags` -lncrypt18 -lnsgr18 -lnzjs18 -ln18 -lnl18 -lngsmshd18 -lnnzst18 -lzt18 -lztkg18 -lmm -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 -lztkg18 `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ldflags` -lncrypt18 -lnsgr18 -lnzjs18 
-ln18 -lnl18 -lngsmshd18 -lnro18 `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/ldflags` -lncrypt18 -lnsgr18 -lnzjs18 -ln18 -lnl18 -lngsmshd18 -lnnzst18 -lzt18 -lztkg18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 `if /usr/bin/ar tv /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/libknlopt.a | grep "kxmnsd.o" > /dev/null 2>&1 ; then echo " " ; else echo "-lordsdo18 -lserver18"; fi` -L/u01/app/oracle/product/18.0.0.0/dbhome_1/ctx/lib/ -lctxc18 -lctx18 -lzx18 -lgx18 -lctx18 -lzx18 -lgx18 -lordimt -lclscest18 -loevm -lclsra18 -ldbcfg18 -lhasgen18 -lskgxn2 -lnnzst18 -lzt18 -lxml18 -lgeneric18 -locr18 -locrb18 -locrutl18 -lhasgen18 -lskgxn2 -lnnzst18 -lzt18 -lxml18 -lgeneric18 -lgeneric18 -lorazip -loraz -llzopro5 -lorabz2 -lipp_z -lipp_bz2 -lippdcemerged -lippsemerged -lippdcmerged -lippsmerged -lippcore -lippcpemerged -lippcpmerged -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 -lsnls18 -lunls18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lcore18 -lsnls18 -lnls18 -lxml18 -lcore18 -lunls18 -lsnls18 -lnls18 -lcore18 -lnls18 -lasmclnt18 -lcommon18 -lcore18 -ledtn18 -laio -lons -lfthread18 `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/sysliblist` -Wl,-rpath,/u01/app/oracle/product/18.0.0.0/dbhome_1/lib -lm `cat /u01/app/oracle/product/18.0.0.0/dbhome_1/lib/sysliblist` -ldl -lm -L/u01/app/oracle/product/18.0.0.0/dbhome_1/lib `test -x /usr/bin/hugeedit -a -r /usr/lib64/libhugetlbfs.so && test -r /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/shugetlbfs.o && echo -Wl,-zcommon-page-size=2097152 -Wl,-zmax-page-size=2097152 -lhugetlbfs`
rm -f /u01/app/oracle/product/18.0.0.0/dbhome_1/bin/oracle
mv /u01/app/oracle/product/18.0.0.0/dbhome_1/rdbms/lib/oracle /u01/app/oracle/product/18.0.0.0/dbhome_1/bin/oracle
chmod 6751 /u01/app/oracle/product/18.0.0.0/dbhome_1/bin/oracle
(if [ ! -f /u01/app/oracle/product/18.0.0.0/dbhome_1/bin/crsd.bin ]; then \
getcrshome="/u01/app/oracle/product/18.0.0.0/dbhome_1/srvm/admin/getcrshome" ; \
if [ -f "$getcrshome" ]; then \
crshome="`$getcrshome`"; \
if [ -n "$crshome" ]; then \
if [ $crshome != /u01/app/oracle/product/18.0.0.0/dbhome_1 ]; then \
oracle="/u01/app/oracle/product/18.0.0.0/dbhome_1/bin/oracle"; \
$crshome/bin/setasmgidwrap oracle_binary_path=$oracle; \
fi \
fi \
fi \
fi\
);

4- Start listener

As the grid user, execute:
grid@ODA01:/home/grid/ [+ASM1] srvctl start listener -listener LISTENER

5- Start database

As the oracle user, execute:
oracle@ODA01:/home/oracle/ [rdbms18000_2] srvctl start database -d TESTDB_DC13

6- Deactivate any existing Unified Auditing policies

oracle@ODA01:/home/oracle/ [SALESPRD] sqh
 
SQL> set line 300
SQL> column user_name format a20
SQL> column policy_name format a50
SQL> column entity_name format a50
SQL> select * from audit_unified_enabled_policies;
 
USER_NAME POLICY_NAME ENABLED ENABLED_OPTION ENTITY_NAME ENTITY_ SUC FAI
-------------------- -------------------------------------------------- ------- --------------- -------------------------------------------------- ------- --- ---
ALL USERS ORA_SECURECONFIG BY BY USER ALL USERS USER YES YES
ALL USERS ORA_LOGON_FAILURES BY BY USER ALL USERS USER NO YES
 
SQL> noaudit policy ORA_SECURECONFIG;
 
Noaudit succeeded.
 
SQL> noaudit policy ORA_LOGON_FAILURES;
 
Noaudit succeeded.
 
SQL> select * from audit_unified_enabled_policies;
 
no rows selected
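If the policies are wanted again once the Dbvisit switchover is done, they can be re-enabled with AUDIT POLICY. A small sketch that writes the statements to a file for review first; the path /tmp/reenable_audit.sql is arbitrary, and ORA_LOGON_FAILURES is re-enabled on failed logons only, matching the SUC=NO / FAI=YES columns shown above:

```shell
# Sketch: statements to re-enable the two default Unified Auditing policies
# after the switchover. Written to a file first so they can be reviewed;
# the file path is arbitrary.
sqlfile=/tmp/reenable_audit.sql
cat > "$sqlfile" <<'SQL'
AUDIT POLICY ORA_SECURECONFIG;
-- ORA_LOGON_FAILURES was enabled on failed logons only (SUC=NO, FAI=YES above)
AUDIT POLICY ORA_LOGON_FAILURES WHENEVER NOT SUCCESSFUL;
SQL
echo "review $sqlfile, then run it as SYSDBA"
```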

This article Dbvisit – Switchover failing with ORA-00600 error due to Unified Auditing been enabled appeared first on the dbi services blog.

Error getting repository data for ol6_x86_64_userspace_ksplice, repository not found

Thu, 2020-05-28 01:58

During ODA deployment I noticed that starting with 18.7, immediately after reimaging or patching the ODA, regular errors appeared in the root mailbox. The error message arrived every hour, at minute 13 and minute 43.

Problem analysis

Ksplice is now implemented and used in ODA version 18.7. It is an open-source extension of the Linux kernel that allows security patches to be applied to a running kernel without the need for reboots, avoiding downtime and improving availability.

Unfortunately, there is an implementation bug and an email alert is generated every 30 minutes in the root Linux user's mailbox.

The error message is the following:
1 Cron Daemon Mon Jan 6 18:43 26/1176 "Cron export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin && [ -x /usr/bin/ksplice ] && (/usr/bin/ksplice --cron user upgrade; /usr/bin/ksp"
 
 
Message 910:
From root@ODA01.local Mon Jan 6 17:43:02 2020
Return-Path:
Date: Mon, 6 Jan 2020 17:43:02 +0100
From: root@ODA01.local (Cron Daemon)
To: root@ODA01.local
Subject: Cron export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin && [ -x /usr/bin/ksplice ] && (/usr/bin/ksplice --cron user upgrade; /usr/bin/ksplice --cron xen upgrade)
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
X-Cron-Env:
X-Cron-Env:
X-Cron-Env:
X-Cron-Env:
X-Cron-Env:
X-Cron-Env:
Status: R
 
Error getting repository data for ol6_x86_64_userspace_ksplice, repository not found

During my troubleshooting, I could find some community discussion : https://community.oracle.com/thread/4300505?parent=MOSC_EXTERNAL&sourceId=MOSC&id=4300505
This is considered by Oracle as a bug in the 18.7 release and is tracked under Bug 30147824:
Bug 30147824 – ENABLING AUTOINSTALL=YES WITH KSPLICE SENDS FREQUENT EMAIL TO ROOT ABOUT MISSING OL6_X86_64_USERSPACE_KSPLICE REPO

Workaround

While waiting for a final solution, the following workaround can be implemented. Oracle's workaround is simply not to execute ksplice.

[root@ODA01 ~]# cd /etc/cron.d
 
[root@ODA01 cron.d]# ls -l
total 20
-rw-r--r--. 1 root root 113 Aug 23 2016 0hourly
-rw-r--r--. 1 root root 818 Dec 18 19:08 ksplice
-rw-------. 1 root root 108 Mar 22 2017 raid-check
-rw-------. 1 root root 235 Jan 25 2018 sysstat
-rw-r--r--. 1 root root 747 Dec 18 19:08 uptrack
 
[root@ODA01 cron.d]# more ksplice
# Replaced by Ksplice on 2019-12-18
# /etc/cron.d/ksplice: cron job for the Ksplice client
#
# PLEASE DO NOT MODIFY THIS CRON JOB.
# Instead, contact Ksplice Support at ksplice-support_ww@oracle.com.
#
# The offsets below are chosen specifically to distribute server load
# and allow for Ksplice server maintenance windows. This cron job
# also only contacts the Ksplice server every Nth time it runs,
# depending on a load control setting on the Ksplice server.
#
# If you would like to adjust the frequency with which your
# systems check for updates, please contact Ksplice Support at
# ksplice-support_ww@oracle.com
13,43 * * * * root export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin && [ -x /usr/bin/ksplice ] && (/usr/bin/ksplice --cron user upgrade; /usr/bin/ksplice --cron xen upgrade)
 
[root@ODA01 cron.d]# mv ksplice /root/Extras/
 
[root@ODA01 cron.d]# ls -l
total 16
-rw-r--r--. 1 root root 113 Aug 23 2016 0hourly
-rw-------. 1 root root 108 Mar 22 2017 raid-check
-rw-------. 1 root root 235 Jan 25 2018 sysstat
-rw-r--r--. 1 root root 747 Dec 18 19:08 uptrack
 
[root@ODA01 cron.d]# ls -l /root/Extras/ksplice
-rw-r--r--. 1 root root 818 Dec 18 19:08 /root/Extras/ksplice
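The manual steps above can be wrapped in a short script. This is only a sketch of the same workaround; it is demonstrated here against throw-away directories, and on the ODA itself you would set CRON_DIR=/etc/cron.d and PARK_DIR=/root/Extras (the park directory is my choice, not something Oracle mandates):

```shell
#!/bin/sh
# Sketch of the workaround: move the ksplice cron job out of the cron
# directory into a "parked" location so it stops running every 30 minutes.
# Uses local demo directories; override CRON_DIR/PARK_DIR on a real ODA.
CRON_DIR=${CRON_DIR:-./demo/cron.d}
PARK_DIR=${PARK_DIR:-./demo/Extras}
mkdir -p "$CRON_DIR" "$PARK_DIR"
touch "$CRON_DIR/ksplice"              # simulate the file shipped by 18.7
if [ -f "$CRON_DIR/ksplice" ]; then
  mv "$CRON_DIR/ksplice" "$PARK_DIR/"
  echo "ksplice cron job parked in $PARK_DIR"
else
  echo "nothing to do: no ksplice cron job in $CRON_DIR"
fi
```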

This article Error getting repository data for ol6_x86_64_userspace_ksplice, repository not found appeared first on the dbi services blog.

Documentum Upgrade – Switch from Lockbox to AEK key

Wed, 2020-05-27 13:00

As you probably know already, Documentum removed the support for RSA Libraries and RSA Lockbox starting with Documentum version 16.7. This means that if you are planning to upgrade to 16.7 or higher versions, you will first need to remove the lockbox from your installation and extract your AEK key from it before starting the upgrade process. To be completely exact, upgrading to 16.7 or above in place (using VMs for example) is normally going to do that for you but if you are using the docker images from OpenText, it’s not going to happen. Therefore, it is always good to know how to do it anyway.

The process to “downgrade” from the Lockbox to the AEK key (the other way around wasn’t really an upgrade in terms of security anyway…) is pretty simple and, if I’m not mistaken, it is now in the Upgrade & Migration Guide of Documentum 16.7. This can be prepared while the repository is running but it will require a quick restart to be applied. If you are facing any issue, you can also go back to the Lockbox first and figure out what the issue is later.
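Going back to the Lockbox mainly means restoring the crypto_lockbox line in the server.ini (the removal procedure below comments it out) and restarting the repository. A minimal sketch of that revert, demonstrated on a throw-away copy of the file; on a real server, s_ini would point to $DOCUMENTUM/dba/config/<REPO>/server.ini:

```shell
# Sketch: revert to the Lockbox by uncommenting crypto_lockbox in server.ini.
# Demonstrated on a local demo copy of the file; afterwards the repository
# must be restarted for the change to take effect.
s_ini=./server.ini.demo
printf '%s\n' 'crypto_mode = AES256_RSA1024_SHA256' \
              'crypto_keystore = Local' \
              '#crypto_lockbox = lockbox.lb' \
              'crypto_keyname = CSaek' > "$s_ini"
sed -i 's,^#crypto_lockbox,crypto_lockbox,' "$s_ini"
grep '^crypto' "$s_ini"
```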

It’s very simple and straightforward to extract the AEK key from the Lockbox but there is one requirement to be met. Indeed, you will need the latest version of the dm_crypto_create utility, which has been packaged starting with:

  • Documentum 7.2 P42
  • Documentum 7.3 P23
  • Documentum 16.4 P10

If you are using an older version or an older patch, you will first need to upgrade/patch to one of these versions. It **might** be possible to just take the binary from one of these patches and use it on older versions of Documentum but that is probably not supported and it would need to be tested first to make sure it doesn’t break in the process. If you want to test that, just make sure to use the correct version of Lockbox libraries (version 3.1 for CS 7.3 and lower // version 4.0 for CS 16.4).

Once you are at the correct version/patch level, the extraction is just one simple command. Therefore, it can be automated easily in the upgrade process. Before starting, let’s set up the environment and make sure everything is currently working with the Lockbox. I’m using a demo environment I built a few weeks ago on Kubernetes to show how it works:

[dmadmin@cs-0 ~]$ cd $DOCUMENTUM/dba
[dmadmin@cs-0 dba]$
[dmadmin@cs-0 dba]$ cp -R secure secure_$(date "+%Y%m%d")
[dmadmin@cs-0 dba]$
[dmadmin@cs-0 dba]$ cd secure
[dmadmin@cs-0 secure]$ ls -ltr
total 24
-rw-rw-r-- 1 dmadmin dmadmin 3750 Mar 30 13:30 lockbox.lb
-rw-rw-r-- 1 dmadmin dmadmin    3 Mar 30 13:30 lockbox.lb.FCD
drwxrwx--- 2 dmadmin dmadmin  152 Mar 30 17:41 ldapdb
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ gr_repo="GR_REPO1"
[dmadmin@cs-0 secure]$ s_ini="$DOCUMENTUM/dba/config/${gr_repo}/server.ini"
[dmadmin@cs-0 secure]$ 
[dmadmin@cs-0 secure]$ grep "crypto" ${s_ini}
crypto_mode = AES256_RSA1024_SHA256
crypto_keystore = Local
crypto_lockbox = lockbox.lb
crypto_keyname = CSaek
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ lb_name=$(grep "^crypto_lockbox" ${s_ini} | sed 's,crypto_lockbox[[:space:]]*=[[:space:]]*,,')
[dmadmin@cs-0 secure]$ aek_name=$(grep "^crypto_keyname" ${s_ini} | sed 's,crypto_keyname[[:space:]]*=[[:space:]]*,,')
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ echo "Lockbox Name: ${lb_name} -- AEK Name: ${aek_name}"
Lockbox Name: lockbox.lb -- AEK Name: CSaek
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ read -s -p "  --> Please put here the Lockbox Passphrase and press Enter: " lb_pp; echo
  --> Please put here the Lockbox Passphrase and press Enter:
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ read -s -p "  --> Please put here the AEK Passphrase and press Enter: " aek_pp; echo
  --> Please put here the AEK Passphrase and press Enter:
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ # If the below command returns '0', then the Lockbox name, AEK passphrase and the AEK Key name are correct
[dmadmin@cs-0 secure]$ # but it doesn't really test the Lockbox passphrase
[dmadmin@cs-0 secure]$ dm_crypto_create -lockbox ${lb_name} -lockboxpassphrase ${lb_pp} -keyname ${aek_name} -passphrase ${aek_pp} -check


Key - CSaek uses algorithm AES_256_CBC.

** An AEK store with the given passphrase exists in lockbox lockbox.lb and got status code returned as '0'.
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ # If the below command returns 'Reset host done', then the Lockbox name and Lockbox passphrase are both correct
[dmadmin@cs-0 secure]$ dm_crypto_manage_lockbox -lockbox ${lb_name} -lockboxpassphrase ${lb_pp} -resetfingerprint
Lockbox lockbox.lb
Lockbox Path $DOCUMENTUM/dba/secure/lockbox.lb
Reset host done
[dmadmin@cs-0 secure]$

 

Once everything is ready & verified, extracting the AEK key is a piece of cake:

[dmadmin@cs-0 secure]$ # AEK passphrase isn't needed to extract the AEK key
[dmadmin@cs-0 secure]$ dm_crypto_create -lockbox ${lb_name} -lockboxpassphrase ${lb_pp} -keyname ${aek_name} -removelockbox -output ${aek_name}


Retrieved key 'CSaek' from lockbox 'lockbox.lb' and stored as '$DOCUMENTUM/dba/secure/CSaek'
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ ls -ltr
total 24
-rw-rw-r-- 1 dmadmin dmadmin 3750 Mar 30 13:30 lockbox.lb
-rw-rw-r-- 1 dmadmin dmadmin    3 Mar 30 13:30 lockbox.lb.FCD
drwxrwx--- 2 dmadmin dmadmin  152 Mar 30 17:41 ldapdb
-rw-r----- 1 dmadmin dmadmin  144 May  8 21:52 CSaek
[dmadmin@cs-0 secure]$

 

When it’s done, a new file has been created with the name specified in the “-output” parameter (${aek_name} above, meaning “CSaek“). The only remaining step is reflecting this change in the server.ini and restarting the repository:

[dmadmin@cs-0 secure]$ grep "crypto" ${s_ini}
crypto_mode = AES256_RSA1024_SHA256
crypto_keystore = Local
crypto_lockbox = lockbox.lb
crypto_keyname = CSaek
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ sed -i 's,^crypto_lockbox,#&,' ${s_ini}
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ grep "crypto" ${s_ini}
crypto_mode = AES256_RSA1024_SHA256
crypto_keystore = Local
#crypto_lockbox = lockbox.lb
crypto_keyname = CSaek
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ 
[dmadmin@cs-0 secure]$ $DOCUMENTUM/dba/dm_shutdown_${gr_repo}
Stopping Documentum server for repository: [GR_REPO1]


        OpenText Documentum iapi - Interactive API interface
        Copyright (c) 2018. OpenText Corporation
        All rights reserved.
        Client Library Release 16.4.0200.0080

Connecting to Server using docbase GR_REPO1.GR_REPO1
[DM_SESSION_I_SESSION_START]info:  "Session 010f123450009905 started for user dmadmin."

Connected to OpenText Documentum Server running Release 16.4.0200.0256  Linux64.Oracle
Session id is s0
API> shutdown,c,T,T
...
OK
API> exit
Bye
Waiting for 90 seconds for server pid, 4188, to disappear.

Fri May  8 21:53:39 UTC 2020: Waiting for shutdown of repository: [GR_REPO1]
Fri May  8 21:53:39 UTC 2020: checking for pid: 4188

Fri May  8 21:53:49 UTC 2020: Waiting for shutdown of repository: [GR_REPO1]
Fri May  8 21:53:49 UTC 2020: checking for pid: 4188

repository: [GR_REPO1] has been shutdown
checking that all children (4214 4218 4219 4263 4305 4348 4542 4557 4584 4766) have shutdown
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ $DOCUMENTUM/dba/dm_start_${gr_repo}
starting Documentum server for repository: [GR_REPO1]
with server log: [$DOCUMENTUM/dba/log/GR_REPO1.log]
server pid: 5055
[dmadmin@cs-0 secure]$
[dmadmin@cs-0 secure]$ head -20 $DOCUMENTUM/dba/log/${gr_repo}.log


    OpenText Documentum Content Server (version 16.4.0200.0256  Linux64.Oracle)
    Copyright (c) 2018. OpenText Corporation
    All rights reserved.

2020-05-08T21:54:02.033346      5055[5055]      0000000000000000        [DM_SERVER_I_START_SERVER]info:  "Docbase GR_REPO1 attempting to open"

2020-05-08T21:54:02.033471      5055[5055]      0000000000000000        [DM_SERVER_I_START_KEY_STORAGE_MODE]info:  "Docbase GR_REPO1 is using database for cryptographic key storage"

2020-05-08T21:54:02.033507      5055[5055]      0000000000000000        [DM_SERVER_I_START_SERVER]info:  "Docbase GR_REPO1 process identity: user(dmadmin)"

2020-05-08T21:54:02.833702      5055[5055]      0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Post Upgrade Processing."

2020-05-08T21:54:02.835032      5055[5055]      0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize Base Types."

2020-05-08T21:54:02.837725      5055[5055]      0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmRecovery."

2020-05-08T21:54:02.846022      5055[5055]      0000000000000000        [DM_SESSION_I_INIT_BEGIN]info:  "Initialize dmACL."

[dmadmin@cs-0 secure]$

 

That’s basically it. You are now running with the plain AEK Key just like before the introduction of the Lockbox a few years ago with the Documentum 7.x version.

 

Maybe one last note on the “dfc.crypto.repository” property of the dfc.properties. As you might know, this property was introduced to support the move to the lockbox, so that Documentum knows which repository should be used for all crypto needs. This parameter should apparently stay, even if you remove the lockbox and continue with the AEK file only, because the decryption logic is on the server side and the DFC clients still need to know which repository can help on that part. Maybe that will change though…

 

This article Documentum Upgrade – Switch from Lockbox to AEK key appeared first on the dbi services blog.

티베로 – The AWR-like “Tibero Performance Repository”

Tue, 2020-05-26 08:56
By Franck Pachot

In a previous post I introduced Tibero as The most compatible alternative to Oracle Database. Compatibility is one thing, but one day you will want to compare the performance. I’ll not do any benchmark here, but show how we can look at performance with TPR – the Tibero Performance Repository – as an equivalent of AWR – the Oracle Automatic Workload Repository. And, as I needed to run some workload, I attempted to run something that was written with Oracle Database in mind: Kevin Closson’s SLOB – Silly Little Oracle Benchmark. The challenge is to make it run on Tibero and get a TPR report.

SLOB

I’ve downloaded SLOB from:


git clone https://github.com/therealkevinc/SLOB_distribution.git
tar -zxvf SLOB_distribution/2019.11.18.slob_2.5.2.tar.gz

and I’ll detail what I had to change in order to have it running on Tibero.

sqlplus

The client command line is “tbsql” and has a very good compatibility with sqlplus:


[SID=t6a u@h:w]$ tbsql -h
Usage: tbsql [options] [logon] [script]
options
-------
  -h,--help        Displays this information
  -v,--version     Displays version information
  -s,--silent      Enables silent mode. Does not display the 
                   start-up message, prompts and commands
  -i,--ignore      Ignore the login script (eg, tbsql.login)
logon
-----
  [username[/password[@connect_identifier]]]
script
------
  @filename[.ext] [parameter ...]
[SID=t6a u@h:w]$ tbsql
tbSQL 6
TmaxData Corporation Copyright (c) 2008-. All rights reserved.
SQL> help
HELP
 ----
 Displays the Help.
{H[ELP]|?} topic
where topic is
 ! {exclamation} % {percent} @ {at}   @@ {double at}
 / {slash}       ACCEPT      APPEND   ARCHIVE LOG
 CHANGE          CLEAR       COLUMN   CONNECT
 DEFINE          DEL         DESCRIBE DISCONNECT
 EDIT            EXECUTE     EXIT     EXPORT
 HELP            HISTORY     HOST     INPUT
 LIST            LOADFILE    LOOP     LS
 PASSWORD        PAUSE       PING     PRINT
 PROMPT          QUIT        RESTORE  RUN
 SAVE            SET         SHOW     TBDOWN
 SPOOL           START       UNDEFINE VARIABLE
 WHENEVER

However, there are a few things I had to change.

The silent mode is “-s” instead of “-S”

The “-L” (no re-prompt if the connection fails) doesn’t exist: tbsql does not re-prompt, and leaves you in the CLI rather than exiting.

sqlplus does not show feedback for fewer than 6 rows; tbsql always shows it by default. We can get the same sqlplus output by setting SET FEEDBACK 6

tbsql returns “No Errors” where sqlplus returns “No errors”

You cannot pass additional spaces in the connection string. sqlplus ignores them but tbsql complains:

All those were easy to change in the setup.sh and runit.sh scripts.
I actually defined a sqlplus() bash function to handle those:


sqlplus(){
 set -- ${@// /}
 set -- ${@/-L/}
 set -- ${@/-S/-s}
 sed \
 -e '1 i SET FEEDBACK 6' \
 | tbsql $* | sed \
 -e 's/No Errors/No errors/'
 echo "--- call stack ---  ${FUNCNAME[0]}()$(basename $0)#${LINENO}$(f=0;while caller $f;do((f++));done|awk '{printf " <- "$2"()@"$3"#"$1}')" >&2
 echo "--- parameters ---  tbsql $*" >&2
}

As you can see I’ve added the print of the bash callstack which I used to find those issues. Here is the idea:

# print call stack in bash:
echo "=== call stack === ${FUNCNAME[0]}()$(basename $0)#${LINENO}$(f=0;while caller $f;do((f++));done|awk '{printf " <- "$2"()@"$3"#"$1}')" >&2

— Franck Pachot (@FranckPachot) January 5, 2020

tnsping

The equivalent of TNSPING is TBPROBE. It takes a host:port, displays nothing, and returns a 0 return code when the connection is ok, or 3 when it failed. Note that there are other statuses, such as 1 when the connection is ok but the database is read-only, and 2 when in mount or nomount. You see here an architecture difference with Oracle: there is no listener; it is the database itself that listens on the port.
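Those return codes can be wrapped in a small helper. This is only a sketch (tbprobe_status is my name, not a Tibero tool); it is a pure function of the exit code, mapping the statuses as just described, so it can be tested without a database:

```shell
# Sketch: turn a tbprobe exit code into a readable status string.
tbprobe_status() {
  case "$1" in
    0) echo "open read-write" ;;
    1) echo "open read-only" ;;
    2) echo "mount or nomount" ;;
    3) echo "connection failed" ;;
    *) echo "unknown ($1)" ;;
  esac
}
# usage on a live system: tbprobe localhost:8629; tbprobe_status $?
tbprobe_status 0
tbprobe_status 3
```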
A small detail of no importance here: my database port is 8629, as mentioned in the previous post, but tbprobe actually connects to 8630:


[SID=t6a u@h:w]$ strace -fyye trace=connect,getsockname,recvfrom,sendto tbprobe localhost:8629
connect(4, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
getsockname(4, {sa_family=AF_NETLINK, nl_pid=41494, nl_groups=00000000}, [12]) = 0
sendto(4, {{len=20, type=RTM_GETADDR, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=1589797671, pid=0}, {ifa_family=AF_UNSPEC, ...}}, 20, 0, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, 12) = 20
connect(4, {sa_family=AF_INET, sin_port=htons(8630), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
getsockname(4<TCP:[127.0.0.1:50565->127.0.0.1:8630]>, {sa_family=AF_INET, sin_port=htons(50565), sin_addr=inet_addr("127.0.0.1")}, [28->16]) = 0
connect(4, {sa_family=AF_INET, sin_port=htons(8630), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
recvfrom(4<TCP:[127.0.0.1:50565->127.0.0.1:8630]>, "\0\0\0\0\0\0\0@\0\0\0\0\0\0\0\0", 16, 0, NULL, NULL) = 16
recvfrom(4<TCP:[127.0.0.1:50565->127.0.0.1:8630]>, "\0\0\0\2\0\0\0\17\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0\6\0\0\0\0\0\0\0\6"..., 64, 0, NULL, NULL) = 64
sendto(4<TCP:[127.0.0.1:50565->127.0.0.1:8630]>, "\0\0\0\204\0\0\0\f\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 28, 0, NULL, 0) = 28
recvfrom(4<TCP:[127.0.0.1:50565->127.0.0.1:8630]>, "\0\0\0\204\0\0\0\f\0\0\0\0\0\0\0\0", 16, 0, NULL, NULL) = 16
recvfrom(4<TCP:[127.0.0.1:50565->127.0.0.1:8630]>, "\0\0\0e\0\0\0\0\0\0\0\1", 12, 0, NULL, NULL) = 12
+++ exited with 0 +++

The return code is correct. What actually happens is that Tibero does not seem to use Out-Of-Band but another port to be able to communicate if the default port is in use. And this is the next port number as mentioned in my instance process list as “listener_special_port”:


[SID=t6a u@h:w]$ cat /home/tibero/tibero6/instance/t6a/.proc.list
Tibero 6   start at (2019-12-16 14:03:00) by 54323
shared memory: 140243894161408 size: 3221225472
shm_key: 847723696 1 sem_key: -1837474876 123 listener_pid: 7026 listener_port: 8629 listener_special_port: 8630 epa_pid: -1
7025 MONP
7027 MGWP
7028 FGWP000
7029 FGWP001
7030 PEWP000
7031 PEWP001
7032 PEWP002
7033 PEWP003
7034 AGNT
7035 DBWR
7036 RCWP

This means that when my database listener is 8629 TBPROBE will return “ok” for 8628 and 8629


[SID=t6a u@h:w]$ for p in {8625..8635} ; do tbprobe  localhost:$p ; echo "localhost:$p -> $?" ; done
localhost:8625 -> 3
localhost:8626 -> 3
localhost:8627 -> 3
localhost:8628 -> 0
localhost:8629 -> 0
localhost:8630 -> 3
localhost:8631 -> 3
localhost:8632 -> 3
localhost:8633 -> 3
localhost:8634 -> 3
localhost:8635 -> 3

Anyway, this has no importance here and I just ignore the tnsping test done by SLOB:


tnsping(){
 true
}
DCL and DDL

Tibero SQL does not allow creating a user with a GRANT statement and needs an explicit CREATE USER. In setup.sh I made the following replacement:


--GRANT CONNECT TO $user IDENTIFIED BY $user;
CREATE USER $user IDENTIFIED BY $user;
GRANT CONNECT TO $user;

The implicit creation of a user with a grant statement is an oraclism that is not very useful anyway.
PARALLEL and CACHE attributes are not allowed at the same place as in Oracle:


-- PARALLEL CACHE PCTFREE 99 TABLESPACE $tablespace
-- STORAGE (BUFFER_POOL KEEP INITIAL 1M NEXT 1M MAXEXTENTS UNLIMITED);
PCTFREE 99 TABLESPACE $tablespace
STORAGE (BUFFER_POOL KEEP INITIAL 1M NEXT 1M MAXEXTENTS UNLIMITED) PARALLEL CACHE;

--NOCACHE PCTFREE 99 TABLESPACE $tablespace
--STORAGE (BUFFER_POOL KEEP INITIAL 1M NEXT 1M MAXEXTENTS UNLIMITED);
PCTFREE 99 TABLESPACE $tablespace
STORAGE (BUFFER_POOL KEEP INITIAL 1M NEXT 1M MAXEXTENTS UNLIMITED);

They just go at the end of the statement and that’s fine.
Tibero has no SYSTEM user by default, so I create one, and at the same time create the IOPS tablespace:


tbsql sys/tibero <<SQL
 create user system identified by manager;
 grant dba to system;
 create tablespace IOPS ;
SQL

Those are the changes I’ve made to be able to run SLOB’s setup.sh and runit.sh.

I also had to change a few things in slob.sql

There is no GET_CPU_TIME() in DBMS_UTILITY so I commented it out:


--v_end_cpu_tm := DBMS_UTILITY.GET_CPU_TIME();
--v_begin_cpu_tm := DBMS_UTILITY.GET_CPU_TIME();

I didn’t check for an equivalent here and just removed it.

The compatibility with Oracle is very good, so the queries on V$SESSION are the same. The only thing I changed is the SID userenv parameter, which is called TID in Tibero:


SELECT ((10000000000 * (SID + SERIAL#)) + 1000000000000) INTO v_my_serial from v$session WHERE sid = ( select sys_context('userenv','tid') from dual);

I got a TBR-11006: Invalid USERENV parameter before this change.

This session ID is larger so I removed the precision:


--v_my_serial NUMBER(16);
v_my_serial NUMBER;

I got a TBR-5111: NUMBER exceeds given precision. (n:54011600000000000, p:16, s:0) before changing it.

The dbms_output was hanging and I disable it:


--SET SERVEROUTPUT ON   ;

I didn’t try to understand the reason. That’s a very bad example of troubleshooting but I just want it to run now.

Again, without trying to understand further, I replaced all PLS_INTEGER by NUMBER as I got: TBR-5072: Failure converting NUMBER to or from a native type

setup.sh

In slob.conf I changed only UPDATE_PCT to 0 for a read-only workload and changed “statspack” to “awr”, and I was able to create 8 schemas:


./setup.sh IOPS 8 </dev/null

(I redirect /dev/null to stdin because my sqlplus() function above reads it)

runit.sh

Now ready to run SLOB.
This is sufficient to run it but I want to collect statistics from the Tibero Performance Repository (TPR).

Tibero Performance Repository is the equivalent of AWR. The package to manage it is DBMS_TPR instead of DBMS_WORKLOAD_REPOSITORY. It has a CREATE_SNAPSHOT procedure, and a REPORT_TEXT_LAST to generate the report between the two last snapshots, without having to get the snapshot IDs.
I replaced the statistics() calls in runit.sh with:


tbsql system/manager <<<'exec dbms_tpr.create_snapshot;';

before
and:


tbsql system/manager <<<'exec dbms_tpr.create_snapshot;';
tbsql system/manager <<<'exec dbms_tpr.report_text_last;';

after.

Now running:


sh runit.sh 4 </dev/null

and I can see the 4 sessions near 100% in CPU.
Note that they are threads in Tibero; you need to run “top -H” to see the detail:

TPR (Tibero Performance Repository) Report 1 session LIO

Here is the report for 5 minutes of running on one session:


--------------------------------------------------------------------------------
               *** TPR (Tibero Performance Repository) Report ***
--------------------------------------------------------------------------------

              DB Name : t6a
                  TAC : NO
                  TSC : NO
         Instance Cnt : 1
              Release : 6   r166256 ( LINUX_X86_64 )
            Host CPUs : 48

   Interval condition : 758 (snapshot id) (#=1 reported)
Report Snapshot Range : 2020-05-26 20:02:37 ~ 2020-05-26 20:07:38
  Report Instance Cnt : 1 (from Instance NO. 'ALL')
         Elapsed Time : 5.02 (mins)
              DB Time : 4.99 (mins)
       Avg. Session # : 1.00

I’m running a simple installation. No TSC (Tibero Standby Cluster, the Active Data Guard equivalent) and no TAC (the Oracle RAC equivalent).
This report is a run with one session only (DB time = Elapsed time) and Avg. Session # is 1.00.


================================================================================
 2.1 Workload Summary
================================================================================

                            Per Second           Per TX         Per Exec         Per Call
                       ---------------  ---------------  ---------------  ---------------
          DB Time(s):             1.00             7.68             0.00            42.80
           Redo Size:         3,096.93        23,901.92             0.52       133,167.86
       Logical Reads:       394,911.05     3,047,903.26            66.40    16,981,175.29
       Block Changes:            26.60           205.33             0.00         1,144.00
      Physical Reads:             0.00             0.00             0.00             0.00
     Physical Writes:             1.31            10.08             0.00            56.14
          User Calls:             0.02             0.18             0.00             1.00
              Parses:             0.02             0.18             0.00             1.00
         Hard Parses:             0.00             0.00             0.00             0.00
              Logons:             0.01             0.05             0.00             0.29
            Executes:         5,947.03        45,898.85             1.00       255,722.14
           Rollbacks:             0.00             0.00             0.00             0.00
        Transactions:             0.13             1.00             0.00             5.57

394,911 logical reads per second is comparable to what I can get from Oracle on this Intel(R) Xeon(R) Platinum 8167M CPU @ 2.00GHz, but let’s see what happens with some concurrency. I miss the DB CPU(s) line, which would quickly show that I am running mostly on CPU, but this can be calculated from other parts of the report (the DB CPU time is 299,134, which covers the 5 minutes of elapsed time). The “Per Call” column is interesting as it has more meaning than the “Per Execution” one, which counts the recursive executions.
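The DB CPU figure can be cross-checked with shell arithmetic, assuming (as the magnitude suggests) that the 299,134 DB CPU time from the report is in milliseconds:

```shell
# Cross-check: DB CPU time vs. the 5-minute snapshot window.
# Assumes the TPR DB CPU figure (299,134) is in milliseconds.
db_cpu_ms=299134
elapsed_s=$((5 * 60))
echo "DB CPU: $((db_cpu_ms / 1000))s of ${elapsed_s}s elapsed"
```

The session was on CPU for essentially the whole interval, confirming a CPU-bound run.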

4 sessions LIO

Another report from the run above with 4 concurrent sessions:


--------------------------------------------------------------------
               *** TPR (Tibero Performance Repository) Report ***
--------------------------------------------------------------------
DB Name : t6a
                  TAC : NO
                  TSC : NO
         Instance Cnt : 1
              Release : 6   r166256 ( LINUX_X86_64 )
            Host CPUs : 48
Interval condition : 756 (snapshot id) (#=1 reported)
Report Snapshot Range : 2020-05-26 19:54:48 ~ 2020-05-26 19:59:49
  Report Instance Cnt : 1 (from Instance NO. 'ALL')
         Elapsed Time : 5.02 (mins)
              DB Time : 19.85 (mins)
       Avg. Session # : 1.00

Ok, there’s a problem in the calculation of “Avg. Session #” which should be 19.85 / 5.02 = 3.95 for my 4 sessions.

About the Host:


================================================================================
 1.1 CPU Usage
================================================================================

                Total       B/G           Host CPU Usage(%)
               DB CPU    DB CPU  ----------------------------------
Instance#    Usage(%)  Usage(%)   Busy   User    Sys   Idle  IOwait
-----------  --------  --------  -----  -----  -----  -----  ------
          0       7.9       7.9    7.9    7.8    0.1   92.1     0.0

mostly idle, as I have 48 vCPUs there.


====================================================================
 1.2 Memory Usage
====================================================================
HOST Mem Size :    322,174M
                      Total SHM Size :      3,072M
                   Buffer Cache Size :      2,048M
               Avg. Shared Pool Size :     288.22M
                  Avg. DD Cache Size :       8.22M
                  Avg. PP Cache Size :      63.44M
                       DB Block Size :          8K
                     Log Buffer Size :         10M

My workload size (80MB defined in slob.conf) is much smaller than the buffer cache, and then I expect an average of 4 active sessions on CPU:


====================================================================
 2.1 Workload Summary
====================================================================
Per Second          Per TX        Per Call
                  --------------- --------------- ---------------
      DB Time(s):            3.96           38.41           79.38
       Redo Size:        3,066.07       29,770.52       61,525.73
   Logical Reads:      427,613.99    4,151,993.87    8,580,787.33
   Block Changes:           26.36          255.90          528.87
  Physical Reads:            0.00            0.00            0.00
 Physical Writes:            0.99            9.61           19.87
      User Calls:            0.05            0.48            1.00
          Parses:            0.03            0.29            0.60
     Hard Parses:            0.00            0.00            0.00
          Logons:            0.02            0.16            0.33
        Executes:       67,015.04      650,694.45    1,344,768.53
       Rollbacks:            0.00            0.00            0.00
    Transactions:            0.10            1.00            2.07

An average of 3.96 sessions is not bad. But 427,613 LIOPS is only a bit higher than the single-session run, now with 4 concurrent sessions: 4x the CPU usage for only about 8% higher throughput…
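A quick side-by-side of the logical-read rates from the two reports (values rounded to integers):

```shell
# Logical Reads/s from the 1-session and 4-session TPR reports above (rounded).
one_session=394911
four_sessions=427614
echo "throughput gain: +$(( (four_sessions - one_session) * 100 / one_session ))%"
echo "per-session rate: $(( four_sessions / 4 )) LIOPS (vs $one_session alone)"
```

So each of the 4 sessions sustains roughly a quarter of the single-session rate, which is what the next paragraph cautions about.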

At this point I must say that I’m not doing a benchmark here. I’m using the same method as I do with Oracle, which I know quite well, but Tibero is totally new to me. I’ve probably not configured it correctly, and the test I’m doing may not be correct. I’m looking at the numbers here only to understand a bit more how it works.

The Time Model is much more detailed than Oracle’s:


================================================================================
 2.2 Workload Stats
================================================================================

                                                                         DB         DB
            Category                                       Stat         Time(ms)   Time(%)
--------------------  -----------------------------------------  ---------------  --------
Request Service Time  -----------------------------------------        1,190,760    100.00
                                            SQL processing time        1,200,818    100.84
                           reply msg processing time for others                0      0.00
                                             commit by msg time                0      0.00
                              SQL processing (batchupdate) time                0      0.00
                                           rollback by msg time                0      0.00
                                                   msg lob time                0      0.00
                                                    msg xa time                0      0.00
                                                   msg dpl time                0      0.00
                                            msg dblink 2pc time                0      0.00
                                                  msg tsam time                0      0.00
                                             msg long read time                0      0.00
      SQL Processing  -----------------------------------------        1,200,818    100.84
                                       SQL execute elapsed time          369,799     31.06
                                          csr fetch select time          355,744     29.88
                                             parse time elapsed           83,992      7.05
                                       sql dd lock acquire time           25,045      2.10
                                                ppc search time            5,975      0.50
                                          csr fetch insert time              115      0.01
                                          csr fetch delete time               37      0.00
                                        hard parse elapsed time                2      0.00
                                        total times to begin tx                1      0.00
                                      failed parse elapsed time                1      0.00
                                      sql dml lock acquire time                0      0.00
                                              cursor close time                0      0.00
                        stat load query hard parse elapsed time                0      0.00
                        stat load query soft parse time elapsed                0      0.00
                              csr fetch direct path insert time                0      0.00
                                           csr fetch merge time                0      0.00
                                                 optimizer time                0      0.00
                                          csr fetch update time                0      0.00
                                 stat load query row fetch time                0      0.00
              Select  -----------------------------------------               17      0.00
                                           table full scan time               11      0.00
                                                 hash join time                6      0.00
                                                      sort time                0      0.00
                                        op_proxy execution time                0      0.00
              Insert  -----------------------------------------                1      0.00
                                              tdd mi total time                1      0.00
                                            tdi insert key time                0      0.00
                                            tdd insert row time                0      0.00
                                              tdi mi total time                0      0.00
                                            tdi fast build time                0      0.00
              Update  -----------------------------------------                0      0.00
                                             tdd update rp time                0      0.00
                                            tdd update row time                0      0.00
                                              tdd mu total time                0      0.00
                                    idx leaf update nonkey time                0      0.00
              Delete  -----------------------------------------               15      0.00
                                            tdd delete row time               15      0.00
                                             tdd delete rp time                0      0.00
                                              tdd md total time                0      0.00
                                              tdi md total time                0      0.00
                                            tdi delete key time                0      0.00

I am surprised to spend 7% of the time in parsing, as I expect the few queries to be parsed only once, which is confirmed by the Workload Summary above.

Having a look at the ratios:


================================================================================
 3.1 Instance Efficiency
================================================================================

                              Value
                           --------
      Buffer Cache Hit %:    100.00
           Library Hit %:    100.00
                PP Hit %:     99.54
             Latch Hit %:     98.44
        Redo Alloc Hit %:    100.00
         Non-Parse CPU %:     92.95

confirms that 7% of the CPU time is about parsing.
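Indeed, the ratio can be recomputed from the time model above (parse time elapsed = 83,992 ms against a request service time of 1,190,760 ms):

```shell
# Non-Parse CPU % derived from the workload stats:
# 100 * (1 - parse time / request service time)
awk 'BEGIN { printf "Non-Parse CPU %%: %.2f\n", 100 * (1 - 83992 / 1190760) }'
# prints: Non-Parse CPU %: 92.95
```

matching the 92.95 reported in the Instance Efficiency section.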

We have many statistics. Here are the time-based ones:


================================================================================
 6.1 Workload Stats (Time-based)
================================================================================

                                     Stat         Time(ms)    Avg. Time(ms)              Num             Size
-----------------------------------------  ---------------  ---------------  ---------------  ---------------
                      SQL processing time        1,200,818        120,081.8               10                0
                Inner SQL processing time        1,200,817        120,081.7               10                0
                         req service time        1,190,760         79,384.0               15                0
                              DB CPU time        1,120,908              1.6          705,270                0
                 SQL execute elapsed time          369,799              0.0       20,174,294                0
                    csr fetch select time          355,744              0.0       20,174,264                0
                         tscan rowid time          187,592              0.0       20,174,102                0
               PSM execution elapsed time          129,820              0.0       60,514,594                0
                       parse time elapsed           83,992              0.0       20,174,308                0
                     tdi fetch start time           47,440              0.0       20,175,956        1,660,966
                 sql dd lock acquire time           25,045              0.0       20,171,527                0
                 isgmt get cr in lvl time           25,004              0.0       20,390,418       22,268,417
                        isgmt get cr time           21,500              0.0       22,479,405                0
                tscan rowid pick exb time           18,734              0.0      106,200,851                0
                    tscan rowid sort time           15,529              0.0       20,173,326                0
                         ppc search time            5,975              0.0       20,174,325                0
                           dd search time            4,612              0.0       40,344,560                0
...
                  hard parse elapsed time                2              0.1               13                0
...
                failed parse elapsed time                1              0.1                5                0
...

This parse time is not hard parse time.
The non-time-based statistics are conveniently ordered by number, which is more useful than alphabetical order:


================================================================================
 6.2 Workload Stats (Number-based)
================================================================================

                                     Stat              Num             Size         Time(ms)
-----------------------------------------  ---------------  ---------------  ---------------
                     candidate bh scanned      128,715,142      153,638,765                0
               consistent gets - no clone      128,700,485                0                0
  fast examines for consistent block gets      128,700,414                0                0
                    consistent block gets      128,698,143                0                0
                 block pin - not conflict      128,691,338                0              796
                              block unpin      128,691,332                0              333
                      rowid sort prefetch      106,200,861       18,945,339                0
                     tscan rowid pick exb      106,200,851                0           18,734
                          dd search count       40,344,560                0            4,612
Number of conflict DBA while scanning can       24,917,669                0                0

...
                    tdi fetch start total       20,175,956        1,660,966           47,440
           parse count (for all sessions)       20,174,308                0           83,992
                         csr fetch select       20,174,264                0          355,744
                              tscan rowid       20,174,102                0          187,592
                         tscan rowid sort       20,173,326                0           15,529
               memory tuner prof register       20,171,661       20,171,661              198
                            execute count       20,171,528                0                0
                      sql dd lock acquire       20,171,527                0           25,045
...
                      parse count (total)                9                0                0
...

and this clearly means that I had as many parse calls as executions. Then I realized that there was nothing about a parse-to-execute ratio in the “Instance Efficiency” section, which is the only ratio I look at in Oracle’s “Instance Efficiency”. Even if the terms are similar, there’s something different from Oracle.
The only documentation I’ve found is in Korean: https://technet.tmaxsoft.com/download.do?filePath=/nas/technet/technet/upload/kss/tdoc/tibero/2015/02/&fileName=FILE-20150227-000002_150227145000_1.pdf
According to this, “parse time elapsed” is the time spent in “parse count (for all sessions)” and covers parsing, syntax and semantic analysis, which is what we call soft parse in Oracle. “parse count (total)” covers parsing, transformation and optimization, which is what we call hard parse in Oracle. With this limited knowledge, it looks like, even when called from PL/SQL, a soft parse occurred for each execution. Look also at the “dd” statistics: “sql dd lock acquire time” is 2.1% of DB time and looks like serialization on the Data Dictionary. And, even if it does not take a lot of time (“dd search time” is low), “dd search count” is incremented twice per execution: each soft parse of the small select has to read the Data Dictionary definitions.
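TPR does not print it, but the Oracle-style Execute to Parse ratio is easy to derive from the section 6.2 numbers (parse count (for all sessions) = 20,174,308; execute count = 20,171,528):

```shell
# Oracle's Execute to Parse % = 100 * (1 - parses / executes)
awk 'BEGIN { printf "Execute to Parse %%: %.2f\n", 100 * (1 - 20174308 / 20171528) }'
# prints: Execute to Parse %: -0.01
```

Essentially zero: one soft parse per execution, which is exactly what the statistics suggest.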

Of course we have wait events (not “Timed Events”, as this section does not include CPU):


================================================================================
 3.2 Top 5 Wait Events by Wait Time
================================================================================

                                                  Time          Wait          Avg           Waits       DB
                         Event            Waits  -outs(%)       Time(s)      Wait(ms)           /TX   Time(%)
------------------------------  ---------------  --------  ------------  ------------  ------------  --------
           spinlock total wait          787,684      0.00         45.47          0.06  2,540,916.13      3.82
          dbwr write time - OS                7      0.00          0.16         23.14         22.58      0.01
spinlock: cache buffers chains            9,837      0.00          0.12          0.01     31,732.26      0.01
               redo write time               36      0.00          0.03          0.78        116.13      0.00
     log flush wait for commit               31      0.00          0.03          0.81        100.00      0.00

Only “spinlock total wait” is significant here.
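The DB Time(%) column checks out against the time model (45.47 seconds of spinlock wait over 1,190.76 seconds of request service time):

```shell
# spinlock total wait as a share of DB time
awk 'BEGIN { printf "%.2f%%\n", 100 * 45.47 / 1190.76 }'
# prints: 3.82%
```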
Here is the section about latches:


================================================================================
 7.7 Spinlock(Latch) Statistics
================================================================================

                                            Get    Get     Avg.Slps       Wait           Nowait   Nowait     DB
                          Name          Request   Miss(%)     /Miss       Time(s)       Request   Miss(%)   Time(%)
------------------------------  ---------------  --------  --------  ------------  ------------  --------  --------
         SPIN_SPIN_WAITER_LIST        2,190,429      4.07      3.60         51.69             0      0.00      4.34
                    SPIN_ALLOC      161,485,088     10.40      3.71         41.42             0      0.00      3.48
                SPIN_LC_BUCKET       60,825,921      0.53     25.55          3.31             0      0.00      0.28
               SPIN_ROOT_ALLOC      126,036,981      0.00     79.52          0.16   126,040,272      0.00      0.01
               SPIN_BUF_BUCKET      257,429,840      0.00     15.82          0.12            34      0.00      0.01
               SPIN_RECR_UNPIN      121,038,780      0.00     57.14          0.08        36,039      0.01      0.01
                 SPIN_SQLSTATS       40,383,027      0.36      0.61          0.07             0      0.00      0.01

These look like very generic latches protected by spinlocks.

While I was there, I captured a Brendan Gregg flame graph during the run:


perf script -i ./perf.data | FlameGraph/stackcollapse-perf.pl | FlameGraph/flamegraph.pl --width=1200 --hash --cp > /tmp/tibero-slob.svg

If you have ever traced some Oracle PL/SQL call stacks, you can see that Tibero is a completely different implementation. But the features are very similar. TPR is really like AWR. And there is also an ASH equivalent: you may have seen in the previous post that I set ACTIVE_SESSION_HISTORY=Y in the .tip file (Tibero instance parameters), and you can query v$active_session_history.

SQL_TRACE

Given the similarity with Oracle, let’s do a good old ‘tkprof’.
I run the SLOB call with SQL_TRACE enabled:


[SID=t6a u@h:w]$ tbsql user1/user1

tbSQL 6

TmaxData Corporation Copyright (c) 2008-. All rights reserved.

Connected to Tibero.

SQL> alter session set sql_trace=y;

Session altered.

SQL> @slob 1 0 0 300 10240 64 LITE 0 FALSE 8 16 3 0 .1 .5 0

PSM completed.

SQL> alter session set sql_trace=y;

Session altered.

The raw trace has been generated, which contains things like this:


=========================================================
PARSING IN CSR=#8 len=87, uid=172, tim=1590498896169612
SELECT COUNT(c2)
                        FROM cf1
                        WHERE ( custid > ( :B1 - :B2 ) ) AND  (custid < :B1)
END OF STMT
[PARSE] CSR=#8 c=0, et=29, cr=0, cu=0, p=0, r=0, tim=1590498896169655
[EXEC] CSR=#8 c=0, et=48, cr=0, cu=0, p=0, r=0, tim=1590498896169720
[FETCH] CSR=#8 c=0, et=307, cr=66, cu=0, p=0, r=1, tim=1590498896170044
[PSM] OBJ_ID=-1, ACCESS_NO=0 c=0, et=14, cr=0, cu=0, p=0, r=0, tim=1590498896170090
[STAT] CSR=#8 ppid=4826 cnt_in_L=1 cnt=1 dep=0 'column projection (et=2, cr=0, cu=0, co=63, cpu=0, ro=1)'
[STAT] CSR=#8 ppid=4826 cnt_in_L=64 cnt=1 dep=1 'sort aggr (et=6, cr=0, cu=0, co=63, cpu=0, ro=1)'
[STAT] CSR=#8 ppid=4826 cnt_in_L=64 cnt=64 dep=2 'table access (rowid) CF1(3230) (et=258, cr=64, cu=0, co=63, cpu=0, ro=60)'
[STAT] CSR=#8 ppid=4826 cnt_in_L=0 cnt=64 dep=3 'index (range scan) I_CF1(3235) (et=20, cr=2, cu=0, co=2, cpu=0, ro=60)'
CSR_CLOSE #8
[PSM] OBJ_ID=-1, ACCESS_NO=0 c=0, et=62, cr=0, cu=0, p=0, r=0, tim=1590498896170170
=========================================================

And all this can be aggregated by TBPROF:

tbprof tb_sqltrc_17292_63_2107964.trc tb_sqltrc_17292_63_2107964.txt sys=yes sort="prscnt"

Which gives something very similar to Oracle’s tkprof:


SELECT COUNT(c2)
                        FROM cf1
                        WHERE ( custid > ( :B1 - :B2 ) ) AND  (custid < :B1)

 stage     count       cpu   elapsed   current     query      disk      rows
-----------------------------------------------------------------------------
 parse         1      0.00      0.00         0         0         0         0
  exec    829366     28.41     27.68         0         0         0         0
 fetch    829366    137.36    135.18         0  55057089         0    829366
-----------------------------------------------------------------------------
   sum   1658733    165.77    162.86         0  55057089         0    829366

    rows  u_rows        execution plan
  ----------------------------------------------------------
   829366       -       column projection (et=859685, cr=0, cu=0, co=52250058, cpu=0, ro=829366)
   829366       -        sort aggr (et=3450526, cr=0, cu=0, co=52250058, cpu=0, ro=829366)
   53079424       -       table access (rowid) CF1(3230) (et=106972426, cr=53079424, cu=0, co=52250058, cpu=0, ro=49761960)
   53079424       -        index (range scan) I_CF1(3235) (et=9256762, cr=1977665, cu=0, co=1658732, cpu=0, ro=49761960)

But here I’m puzzled. From the TPR statistics, I got the impression that each SQL statement in the PSM (Persistent Stored Module – the PL/SQL compatible procedural language) was soft parsed, but I was wrong: I see only one parse call here. Is there something missing? I don’t think so. The total time in this profile is 162.86 seconds during a 300-second run without any think time. Writing the trace file is not included there. If I compare the logical reads per second during the SQL time, 55057089/162.86=338064, I am near the value I got without sql_trace.
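That last arithmetic is easy to verify:

```shell
# logical reads per second during the traced SQL time
awk 'BEGIN { printf "%.0f LIO/s\n", 55057089 / 162.86 }'
# prints: 338064 LIO/s
```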

I leave it with unanswered questions about the parse statistics. The most important thing, which was the goal of this post, is that there are lots of troubleshooting tools, similar to Oracle’s, and I was able to run something that was really specific to Oracle, with sqlplus, SQL, and PL/SQL, without the need for many changes.

The post 티베로 – The AWR-like “Tibero Performance Repository” appeared first on Blog dbi services.

How to add storage on ODA X8-2M

Tue, 2020-05-26 06:30

Recently I had to add some storage on an ODA X8-2M that I deployed early February. At that time the latest available release was ODA 18.7. In this post I would like to share my experience and the challenges I faced.

ODA X8-2M storage extension

As per the Oracle datasheet, we initially have 2 NVMe SSDs installed, with a usable capacity of 5.8 TB. We can extend in increments of 2 disks up to 12 NVMe SSDs, which brings the ASM storage up to 29.7 TB of usable capacity.
My configuration initially had 4 NVMe SSDs and we wanted to add 2 more.

Challenge

During the procedure to add the disks, I was surprised to see that with release 18.7 the usual expand storage command was not recognized.

[root@ODA01 ~]# odaadmcli expand storage -ndisk 2
Command 'odaadmcli expand storage' is not supported

What the hell is going on here? This was always possible on previous ODA generations and previous releases!
Looking closer at the documentation, I found the following note:
Note: In this release, you can add storage as per your requirement, or deploy the full storage capacity for Oracle Database Appliance X8-2HA and X8-2M hardware models at the time of initial deployment of the appliance. You can only utilize whatever storage you configured during the initial deployment of the appliance (before the initial system power ON and software provisioning and configuration). You cannot add additional storage after the initial deployment of the X8-2HA and X8-2M hardware models, in this release of Oracle Database Appliance, even if the expandable storage slots are present as empty.

Hmmm, 18.5 still allowed it. Fortunately, version 18.8 had just been released at that time, and post-installation storage expansion is again possible with that release.
I then had to first patch my ODA to release 18.8. A good blog post about ODA 18.8 patching from one of my colleagues can be found here: Patching ODA from 18.3 to 18.8. Coming from 18.3, 18.5 or 18.7 follows the same process.

Adding disks on the ODA Checking ASM usage

Let’s first check the current ASM usage:

grid@ODA01:/home/grid/ [+ASM1] asmcmd
 
ASMCMD> lsdg
State Type Rebal Sector Logical_Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED NORMAL N 512 512 4096 4194304 12211200 7550792 3052544 2248618 0 Y DATA/
MOUNTED NORMAL N 512 512 4096 4194304 12209152 2848956 3052032 -102044 0 N RECO/

Check state of the disk

Before adding a new disk, all current disks need to be healthy.

[root@ODA01 ~]# odaadmcli show disk
NAME PATH TYPE STATE STATE_DETAILS
 
pd_00 /dev/nvme0n1 NVD ONLINE Good
pd_01 /dev/nvme1n1 NVD ONLINE Good
pd_02 /dev/nvme3n1 NVD ONLINE Good
pd_03 /dev/nvme2n1 NVD ONLINE Good

We are using 2 ASM disk groups:
[root@ODA01 ~]# odaadmcli show diskgroup
DiskGroups
----------
DATA
RECO

Run orachk

It is recommended to run orachk to make sure the ODA is healthy before adding new disks:

[root@ODA01 ~]# cd /opt/oracle.SupportTools/orachk/oracle.ahf/orachk
[root@ODA01 orachk]# ./orachk -nordbms

Physical disk installation

In my configuration I already have 4 disks. The 2 additional disks will be installed in slots 4 and 5. After a disk is plugged in, we need to power it on:

[root@ODA01 orachk]# odaadmcli power disk on pd_04
Disk 'pd_04' already powered on

It is recommended to wait at least one minute before plugging in the next disk. The disk LED should also light green. Similarly, we can power on the next disk once it is plugged into slot 5 of the server:

[root@ODA01 orachk]# odaadmcli power disk on pd_05
Disk 'pd_05' already powered on

Expand the storage

The following command expands the storage with the 2 new disks:
[root@ODA01 orachk]# odaadmcli expand storage -ndisk 2
Precheck passed.
Check the progress of expansion of storage by executing 'odaadmcli show disk'
Waiting for expansion to finish ...

Check expansion

At the beginning of the expansion, we can see that the 2 new disks have been detected and are being initialized:
[root@ODA01 ~]# odaadmcli show disk
NAME PATH TYPE STATE STATE_DETAILS
 
pd_00 /dev/nvme0n1 NVD ONLINE Good
pd_01 /dev/nvme1n1 NVD ONLINE Good
pd_02 /dev/nvme3n1 NVD ONLINE Good
pd_03 /dev/nvme2n1 NVD ONLINE Good
pd_04 /dev/nvme4n1 NVD UNINITIALIZED NewDiskInserted
pd_05 /dev/nvme5n1 NVD UNINITIALIZED NewDiskInserted

Once the expansion is finished, we can check that all our disks, including the new ones, are OK:
[root@ODA01 ~]# odaadmcli show disk
NAME PATH TYPE STATE STATE_DETAILS
 
pd_00 /dev/nvme0n1 NVD ONLINE Good
pd_01 /dev/nvme1n1 NVD ONLINE Good
pd_02 /dev/nvme3n1 NVD ONLINE Good
pd_03 /dev/nvme2n1 NVD ONLINE Good
pd_04 /dev/nvme4n1 NVD ONLINE Good
pd_05 /dev/nvme5n1 NVD ONLINE Good

We can also query the ASM instance and see that the 2 new disks in slots 4 and 5 are online:
SQL> col PATH format a50
SQL> set line 300
SQL> set pagesize 500
SQL> select mount_status, header_status, mode_status, state, name, path, label from v$ASM_DISK order by name;
 
MOUNT_S HEADER_STATU MODE_ST STATE NAME PATH LABEL
------- ------------ ------- -------- ------------------------------ -------------------------------------------------- -------------------------------
CACHED MEMBER ONLINE NORMAL NVD_S00_PHLN9440011FP1 AFD:NVD_S00_PHLN9440011FP1 NVD_S00_PHLN9440011FP1
CACHED MEMBER ONLINE NORMAL NVD_S00_PHLN9440011FP2 AFD:NVD_S00_PHLN9440011FP2 NVD_S00_PHLN9440011FP2
CACHED MEMBER ONLINE NORMAL NVD_S01_PHLN94410040P1 AFD:NVD_S01_PHLN94410040P1 NVD_S01_PHLN94410040P1
CACHED MEMBER ONLINE NORMAL NVD_S01_PHLN94410040P2 AFD:NVD_S01_PHLN94410040P2 NVD_S01_PHLN94410040P2
CACHED MEMBER ONLINE NORMAL NVD_S02_PHLN9490009MP1 AFD:NVD_S02_PHLN9490009MP1 NVD_S02_PHLN9490009MP1
CACHED MEMBER ONLINE NORMAL NVD_S02_PHLN9490009MP2 AFD:NVD_S02_PHLN9490009MP2 NVD_S02_PHLN9490009MP2
CACHED MEMBER ONLINE NORMAL NVD_S03_PHLN944000SQP1 AFD:NVD_S03_PHLN944000SQP1 NVD_S03_PHLN944000SQP1
CACHED MEMBER ONLINE NORMAL NVD_S03_PHLN944000SQP2 AFD:NVD_S03_PHLN944000SQP2 NVD_S03_PHLN944000SQP2
CACHED MEMBER ONLINE NORMAL NVD_S04_PHLN947101TZP1 AFD:NVD_S04_PHLN947101TZP1 NVD_S04_PHLN947101TZP1
CACHED MEMBER ONLINE NORMAL NVD_S04_PHLN947101TZP2 AFD:NVD_S04_PHLN947101TZP2 NVD_S04_PHLN947101TZP2
CACHED MEMBER ONLINE NORMAL NVD_S05_PHLN947100BXP1 AFD:NVD_S05_PHLN947100BXP1 NVD_S05_PHLN947100BXP1
CACHED MEMBER ONLINE NORMAL NVD_S05_PHLN947100BXP2 AFD:NVD_S05_PHLN947100BXP2 NVD_S05_PHLN947100BXP2

CACHED MEMBER ONLINE DROPPING SSD_QRMDSK_P1 AFD:SSD_QRMDSK_P1 SSD_QRMDSK_P1
CACHED MEMBER ONLINE DROPPING SSD_QRMDSK_P2 AFD:SSD_QRMDSK_P2 SSD_QRMDSK_P2
 
14 rows selected.

The operating system recognizes the disks as well:
grid@ODA01:/home/grid/ [+ASM1] cd /dev
 
grid@ODA01:/dev/ [+ASM1] ls -l nvme*
crw-rw---- 1 root root 246, 0 May 14 10:31 nvme0
brw-rw---- 1 grid asmadmin 259, 0 May 14 10:31 nvme0n1
brw-rw---- 1 grid asmadmin 259, 1 May 14 10:31 nvme0n1p1
brw-rw---- 1 grid asmadmin 259, 2 May 14 10:31 nvme0n1p2
crw-rw---- 1 root root 246, 1 May 14 10:31 nvme1
brw-rw---- 1 grid asmadmin 259, 5 May 14 10:31 nvme1n1
brw-rw---- 1 grid asmadmin 259, 10 May 14 10:31 nvme1n1p1
brw-rw---- 1 grid asmadmin 259, 11 May 14 14:38 nvme1n1p2
crw-rw---- 1 root root 246, 2 May 14 10:31 nvme2
brw-rw---- 1 grid asmadmin 259, 4 May 14 10:31 nvme2n1
brw-rw---- 1 grid asmadmin 259, 7 May 14 14:38 nvme2n1p1
brw-rw---- 1 grid asmadmin 259, 9 May 14 14:38 nvme2n1p2
crw-rw---- 1 root root 246, 3 May 14 10:31 nvme3
brw-rw---- 1 grid asmadmin 259, 3 May 14 10:31 nvme3n1
brw-rw---- 1 grid asmadmin 259, 6 May 14 10:31 nvme3n1p1
brw-rw---- 1 grid asmadmin 259, 8 May 14 10:31 nvme3n1p2
crw-rw---- 1 root root 246, 4 May 14 14:30 nvme4
brw-rw---- 1 grid asmadmin 259, 15 May 14 14:35 nvme4n1
brw-rw---- 1 grid asmadmin 259, 17 May 14 14:38 nvme4n1p1
brw-rw---- 1 grid asmadmin 259, 18 May 14 14:38 nvme4n1p2
crw-rw---- 1 root root 246, 5 May 14 14:31 nvme5
brw-rw---- 1 grid asmadmin 259, 16 May 14 14:35 nvme5n1
brw-rw---- 1 grid asmadmin 259, 19 May 14 14:38 nvme5n1p1
brw-rw---- 1 grid asmadmin 259, 20 May 14 14:38 nvme5n1p2

Check ASM space

Querying the ASM disk groups, we can see that both have received additional space in proportion to the percentages assigned to the DATA and RECO disk groups during appliance creation. In my case it was a 50/50 split between DATA and RECO.

grid@ODA01:/dev/ [+ASM1] asmcmd
 
ASMCMD> lsdg
State Type Rebal Sector Logical_Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED NORMAL Y 512 512 4096 4194304 18316288 13655792 3052544 5301118 0 Y DATA/
MOUNTED NORMAL Y 512 512 4096 4194304 18313216 8952932 3052032 2949944 0 N RECO/
ASMCMD>
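Comparing the Usable_file_MB figures for DATA before (2,248,618 MB) and after (5,301,118 MB) the expansion shows the gained usable space in TB:

```shell
# convert the DATA usable space from MB to TB (1 TB = 1,048,576 MB)
awk 'BEGIN { printf "before: %.2f TB, after: %.2f TB\n",
             2248618 / 1048576, 5301118 / 1048576 }'
# prints: before: 2.14 TB, after: 5.06 TB
```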

Conclusion

Adding new disks to an ODA is quite easy and fast. Surprisingly, with ODA release 18.7 you are not able to expand ASM storage once the appliance is installed. This is a real regression, losing the ability to extend the storage. Fortunately, this has been solved in ODA release 18.8.

The post How to add storage on ODA X8-2M appeared first on Blog dbi services.

6 things Oracle could do for a better ODA

Tue, 2020-05-26 04:44
Introduction

With the latest ODA X8 range, at least 80% of customers could find an ODA configuration that fits their needs. For the others, either they can’t afford it, or they are already in the Cloud, or they need extremely large storage or EXADATA performance. Among these 80% of customers, only a few choose ODA. Let’s see how Oracle could improve the ODA to make it a must.

1) Make robust and reliable releases

This is the main point. ODA is built on Linux, Grid Infrastructure and database software, nearly identical to what you can find on a classic linux server. But it comes bundled and with odacli, a central CLI to manage deployment, database creations, updates, migrations and so on. And it sometimes has annoying bugs. More reliable releases could also make patching less tricky, and customers much more confident in this kind of operation.

It could also be nice to have long-term releases on ODA, like on the database. One long-term release each year, with only bug fixes and deeply tested, for those customers who prefer stability and reliability: most of them.

2) Make a real GUI

An appliance is something that eases your life. You unpack the server, you plug it in, you press the power button, and you start configuring it with a smart GUI. ODA is not yet that nice: the GUI is quite basic, and most of us use the CLI to have complete control over all the features. So please Oracle, make the GUI a real strength of the ODA, with a pinch of Cloud Control features but without the complexity of Cloud Control. That would be a game-changer.

3) Integrate Data Guard management

Data Guard works fine on ODA, but you’ll have to set up the configuration yourself. Most customers plan to use Data Guard if they are using Enterprise Edition. And actually, ODA doesn’t know about Data Guard: you’ll need to configure everything as if it were a standard server. In my dreams, an ODA could be paired up with another one, with standby databases automatically created on the paired ODA, duplicated and synchronized with the primaries. Later we could easily switch over and switch back from the GUI, without any specific knowledge.

There is a lot of work to achieve this, but it could be a killer feature.

4) Get rid of GI for ODA lites

Yes, Grid Infrastructure adds complexity. And a “lite” appliance means simplified appliance. GI is needed mainly because we need ASM redundancy, and ASM is really nice. It’s actually better than RAID. But do you remember how ASM was configured in 10g? Just by deploying a standalone DBhome and creating a pfile with instance_type=ASM. That’s it. No dependencies between ASM and the other DBHomes. Once ASM is on the server, each instance can use it. And it could make patching easier for sure.

5) Make IPs modifiable

Because sometimes you need to change the public IP address of an ODA, or its name. Moving to another datacenter is a good example. For now, changing IPs is only possible when the appliance is not yet deployed, meaning unused. You can eventually change the network configuration manually, but don’t expect future patches to work. An easy function to change the network configuration on a deployed ODA would be welcome.

6) Be proud of this product!

Last but not least. Yes, Cloud is the future. And Oracle Cloud Infrastructure is a great piece of Cloud. But it will take time for customers to migrate to the Cloud. Some of them are not even considering Cloud at all for the moment. They want on-premises solutions. ODA is THE solution that perfectly fits between OCI and EXADATA. It’s a great product, it’s worth the money and it has many years to live. To promote these appliances, maybe Oracle could make ODA better integrated with OCI, as a cross-technology solution: being able to backup and restore the ODA configuration to the Cloud, to put a standby database in OCI from the GUI, to duplicate a complete environment to the Cloud for testing purposes, …

Conclusion

ODA is a serious product, but it still needs several improvements to increase its popularity.

The article 6 things Oracle could do for a better ODA appeared first on Blog dbi services.

Automate AWS deployments with Ansible + Terraform

Tue, 2020-05-26 01:40
Automate AWS deployments with Ansible + Terraform

This installation is done from a bastion server already available with the proper network permissions.
For other deployment types, adapt it to your needs.

Install requirements

Ansible installation:

sudo apt update
sudo apt install -y software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible

ref: https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html#installing-ansible-on-ubuntu

Terraform installation:

wget https://releases.hashicorp.com/terraform/0.12.24/terraform_0.12.24_linux_amd64.zip
sudo unzip -d /usr/local/bin/ ./terraform_0.12.24_linux_amd64.zip

ref: https://www.techrepublic.com/article/how-to-install-terraform-on-ubuntu-server/

AWS Client installation:

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

ref: https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-linux.html

Connect your environment to your AWS cloud
$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: eu-central-1
Default output format [None]: json

ref: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html#cli-quick-configuration

Create a new Ansible project

This is my favorite Ansible layout, the best one I’ve seen so far:

mkdir -p ~/my_aws_project/inventory
mkdir -p ~/my_aws_project/playbooks
mkdir -p ~/my_aws_project/roles
Add the EC2 dynamic inventory

Ansible can work with a different kind of inventory called a dynamic inventory.
Instead of declaring your inventory statically in a file, you can generate it from a source.
This source can be a database, an Active Directory, etc. A dynamic inventory is a script that outputs JSON in a structure that Ansible can handle. We could develop our own script to discover our EC2 infrastructure, but that would take some time. Or we can use the one already provided with Ansible:

Install prerequisites:

sudo apt install -y python3-pip
sudo pip3 install boto
sudo pip3 install ansible
sudo rm /usr/bin/python
sudo ln -s /usr/bin/python3 /usr/bin/python

Get the EC2 dynamic inventory:

wget -O ~/my_aws_project/inventory/ec2.py \
https://raw.githubusercontent.com/ansible/ansible/stable-2.9/contrib/inventory/ec2.py
wget -O ~/my_aws_project/inventory/ec2.ini \
https://raw.githubusercontent.com/ansible/ansible/stable-2.9/contrib/inventory/ec2.ini
chmod +x ~/my_aws_project/inventory/ec2.py

There are multiple configuration options you can set in the ini file. For this blog I’ll change these vars:

regions = eu-central-1
vpc_destination_variable = private_ip_address

Test the inventory script:

~/my_aws_project/inventory/ec2.py ## ---> return JSON description of your AWS infrastructure

I chose these values because I want to work in one AWS region only and over the private network, since my bastion is already inside the AWS infrastructure.
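To illustrate the JSON structure such a script must emit, here is a minimal hand-written dynamic inventory sketch. The group and host names are made up for illustration; the real ec2.py builds the same shape (groups, hosts, `_meta`/`hostvars`) from the AWS API:

```python
#!/usr/bin/env python3
"""Minimal dynamic inventory sketch: the JSON shape Ansible expects."""
import json
import sys

def build_inventory():
    # Groups map to lists of hosts; _meta.hostvars carries per-host variables.
    return {
        "eu-central-1": {"hosts": ["10.0.1.10", "10.0.1.11"]},
        "tag_Owner_Nicolas": {"hosts": ["10.0.1.10"]},
        "_meta": {
            "hostvars": {
                "10.0.1.10": {"ec2_instance_type": "t2.micro"},
                "10.0.1.11": {"ec2_instance_type": "t2.micro"},
            }
        },
    }

if __name__ == "__main__":
    # Ansible calls the script with --list (whole inventory)
    # or --host <name> (variables for one host).
    if "--host" in sys.argv:
        host = sys.argv[sys.argv.index("--host") + 1]
        print(json.dumps(build_inventory()["_meta"]["hostvars"].get(host, {})))
    else:
        print(json.dumps(build_inventory()))
```

Any executable that honors this `--list`/`--host` contract can serve as an inventory source, which is exactly what ec2.py does at a larger scale.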

Add a role for our deployment

I’ll create a role with the only purpose to deploy my infrastructure into AWS.

ansible-galaxy init --init-path ~/my_aws_project/roles ec2_instances_dep
Cable the components (Ansible configuration)

To have that layout working fine in the simplest way, I use this configuration:

## file ~/my_aws_project/ansible.cfg
[defaults]
roles_path = ./roles
inventory  = ./inventory/ec2.py

Test the ansible inventory:

cd ~/my_aws_project
ansible-inventory --graph ## ---> return the Ansible interpreted inventory
Terraform with Ansible

When I need to do something with Ansible, I first check in the list of modules whether the work is already done.
And, nicely, there is a module for Terraform.

So I can add this module in my task main file of my role:

## ~/my_aws_project/roles/ec2_instances_dep/tasks/main.yml
---
- name: I create a directory to store my Terraform config
  file:
    path: "~/aws_terraform"
    state: directory
    recurse: yes

- name: I copy my Terraform template into the working directory created above
  template:
    src: "my_ec2_infra.tf"
    dest: "~/aws_terraform/my_ec2_infra.tf"

- name: I deploy my configuration into AWS from Ansible
  terraform:
    project_path: "~/aws_terraform"
    force_init: true
    state: "present"
  register: r_aws

- name: I do whatever I need to do in my EC2 infrastructure
  debug: msg="update, install, create user, start services, etc"

- name: I destroy my AWS infrastructure 
  terraform:
    project_path: "~/aws_terraform"
    state: "absent"
Terraform content

Add this file into the templates directory of the role: ~/my_aws_project/roles/ec2_instances_dep/templates

## file ~/my_aws_project/roles/ec2_instances_dep/templates/my_ec2_infra.tf
provider "aws" {
  region = "eu-central-1"
}

resource "aws_instance" "dba-essential" {
  count                       = "5"
  ami                         = "ami-0e342d72b12109f91"
  availability_zone           = "eu-central-1a"
  instance_type               = "t2.micro"
  associate_public_ip_address = false
  security_groups             = ["my_sg_01"]
  vpc_security_group_ids      = ["sg-602eff2724d52a0b7"]
  key_name                    = "my_key_01"

  root_block_device {
    delete_on_termination = true
    encrypted             = false
    volume_size           = 15
    volume_type           = "gp2"
  }

  tags = {
    Owner           = "Nicolas"
    Name            = "crash-test-${count.index + 1}"
  }

}
Create a playbook to call the role
## file ~/my_aws_project/playbooks/deploy.yml
---
- name: Deploy my infrastructure
  hosts: localhost
  roles:
    - ec2_instances_dep
Run the playbook
cd ~/my_aws_project
ansible-playbook playbooks/deploy.yml

Boom! Here it is. Now imagine that you generate a unique key and a unique directory for each deployment: you can then deploy as much infrastructure as your credit card will accept.
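The unique-directory idea can be sketched with a small hypothetical helper (not part of the post’s code): give each deployment its own Terraform `project_path`, so each run gets an isolated `terraform.tfstate` and deployments do not overwrite each other:

```python
import uuid
from pathlib import Path

def new_deployment_dir(base: str) -> Path:
    """Create an isolated working directory for one Terraform deployment.

    Point the Ansible 'terraform' module's project_path at the returned
    directory so each deployment keeps its own state file.
    """
    path = Path(base).expanduser() / f"deploy-{uuid.uuid4().hex[:8]}"
    path.mkdir(parents=True, exist_ok=False)
    return path

if __name__ == "__main__":
    # Two calls give two independent state directories:
    print(new_deployment_dir("/tmp/aws_terraform_demo"))
    print(new_deployment_dir("/tmp/aws_terraform_demo"))
```

You would then template `my_ec2_infra.tf` into that directory and pass it as `project_path`, one directory per concurrent deployment.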

I hope this helps, and please comment below for any questions.

The article Automate AWS deployments with Ansible + Terraform appeared first on Blog dbi services.

Issue deleting a database on ODA?

Mon, 2020-05-25 15:33

I recently faced an issue deleting databases on an ODA. I was getting the following error whatever database I wanted to delete: DCS-10001:Internal error encountered: null.

Through this blog, I would like to share my experience on this case, hoping it will help you if you are facing the same problem. On this project I was using ODA Releases 18.5 and 18.8 and faced the same problem on both versions. On 18.3 and previous releases this was not the case.

Deleting the database

With odacli I tried to delete my TEST database, running the following command :

[root@ODA01 bin]# odacli delete-database -in TEST -fd
{
"jobId" : "bcdcbf59-0fe6-44b7-af7f-91f68c7697ed",
"status" : "Running",
"message" : null,
"reports" : [ {
"taskId" : "TaskZJsonRpcExt_858",
"taskName" : "Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion",
"taskResult" : "",
"startTime" : "May 06, 2020 11:36:38 AM CEST",
"endTime" : "May 06, 2020 11:36:38 AM CEST",
"status" : "Success",
"taskDescription" : null,
"parentTaskId" : "TaskSequential_856",
"jobId" : "bcdcbf59-0fe6-44b7-af7f-91f68c7697ed",
"tags" : [ ],
"reportLevel" : "Info",
"updatedTime" : "May 06, 2020 11:36:38 AM CEST"
} ],
"createTimestamp" : "May 06, 2020 11:36:38 AM CEST",
"resourceList" : [ ],
"description" : "Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51",
"updatedTime" : "May 06, 2020 11:36:38 AM CEST"
}

The job was failing with DCS-10001 Error :

[root@ODA01 bin]# odacli describe-job -i "bcdcbf59-0fe6-44b7-af7f-91f68c7697ed"
 
Job details
----------------------------------------------------------------
ID: bcdcbf59-0fe6-44b7-af7f-91f68c7697ed
Description: Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51
Status: Failure
Created: May 6, 2020 11:36:38 AM CEST
Message: DCS-10001:Internal error encountered: null.
 
Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
database Service deletion for d6542252-dfa4-47f9-9cfc-22b4f0575c51 May 6, 2020 11:36:38 AM CEST May 6, 2020 11:36:50 AM CEST Failure
database Service deletion for d6542252-dfa4-47f9-9cfc-22b4f0575c51 May 6, 2020 11:36:38 AM CEST May 6, 2020 11:36:50 AM CEST Failure
Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion May 6, 2020 11:36:38 AM CEST May 6, 2020 11:36:38 AM CEST Success
Database Deletion May 6, 2020 11:36:39 AM CEST May 6, 2020 11:36:39 AM CEST Success
Unregister Db From Cluster May 6, 2020 11:36:39 AM CEST May 6, 2020 11:36:39 AM CEST Success
Kill Pmon Process May 6, 2020 11:36:39 AM CEST May 6, 2020 11:36:39 AM CEST Success
Database Files Deletion May 6, 2020 11:36:39 AM CEST May 6, 2020 11:36:40 AM CEST Success
Deleting Volume May 6, 2020 11:36:47 AM CEST May 6, 2020 11:36:50 AM CEST Success
database Service deletion for d6542252-dfa4-47f9-9cfc-22b4f0575c51 May 6, 2020 11:36:50 AM CEST May 6, 2020 11:36:50 AM CEST Failure

Troubleshooting

In the dcs-agent.log, located in the /opt/oracle/dcs/log folder, you might see the following errors :

2019-11-27 13:54:30,106 ERROR [database Service deletion for 89e11f5d-9789-44a3-a09d-2444f0fda99e : JobId=05a2d017-9b64-4e92-a7df-3ded603d0644] [] c.o.d.c.j.JsonRequestProcessor: RPC request invocation failed on request: {"classz":"com.oracle.dcs.agent.rpc.service.dataguard.DataguardActions","method":"deleteListenerEntry","params":[{"type":"com.oracle.dcs.agent.model.DB","value":{"updatedTime":1573023492194,"id":"89e11f5d-9789-44a3-a09d-2444f0fda99e","name":"TEST","createTime":1573023439244,"state":{"status":"CONFIGURED"},"dbName":"TEST","databaseUniqueName":"TEST_RZB","dbVersion":"11.2.0.4.190115","dbHomeId":"c58cdcfd-e5b2-4041-b993-8df5a5d5ada4","dbId":null,"isCdb":false,"pdBName":null,"pdbAdminUserName":null,"enableTDE":false,"isBcfgInSync":null,"dbType":"SI","dbTargetNodeNumber":"0","dbClass":"OLTP","dbShape":"odb1","dbStorage":"ACFS","dbOnFlashStorage":false,"level0BackupDay":"sunday","instanceOnly":true,"registerOnly":false,"rmanBkupPassword":null,"dbEdition":"SE","dbDomainName":"ksbl.local","dbRedundancy":null,"dbCharacterSet":{"characterSet":"AL32UTF8","nlsCharacterset":"AL16UTF16","dbTerritory":"AMERICA","dbLanguage":"AMERICAN"},"dbConsoleEnable":false,"backupDestination":"NONE","cloudStorageContainer":null,"backupConfigId":null,"isAutoBackupDisabled":false}}],"revertable":false,"threadId":111}
! java.lang.NullPointerException: null
! at com.oracle.dcs.agent.rpc.service.dataguard.DataguardOperations.deleteListenerEntry(DataguardOperations.java:2258)
! at com.oracle.dcs.agent.rpc.service.dataguard.DataguardActions.deleteListenerEntry(DataguardActions.java:24)
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
! at java.lang.reflect.Method.invoke(Method.java:498)
! at com.oracle.dcs.commons.jrpc.JsonRequestProcessor.invokeRequest(JsonRequestProcessor.java:33)
! ... 23 common frames omitted
! Causing: com.oracle.dcs.commons.exception.DcsException: DCS-10001:Internal error encountered: null.
! at com.oracle.dcs.commons.exception.DcsException$Builder.build(DcsException.java:68)
! at com.oracle.dcs.commons.jrpc.JsonRequestProcessor.invokeRequest(JsonRequestProcessor.java:45)
! at com.oracle.dcs.commons.jrpc.JsonRequestProcessor.process(JsonRequestProcessor.java:74)
! at com.oracle.dcs.agent.task.TaskZJsonRpcExt.callInternal(TaskZJsonRpcExt.java:65)
! at com.oracle.dcs.agent.task.TaskZJsonRpc.call(TaskZJsonRpc.java:182)
! at com.oracle.dcs.agent.task.TaskZJsonRpc.call(TaskZJsonRpc.java:26)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:82)
! at com.oracle.dcs.commons.task.TaskApi.call(TaskApi.java:37)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:39)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:10)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:82)
! at com.oracle.dcs.commons.task.TaskApi.call(TaskApi.java:37)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:39)
! at com.oracle.dcs.agent.task.TaskZLockWrapper.call(TaskZLockWrapper.java:64)
! at com.oracle.dcs.agent.task.TaskZLockWrapper.call(TaskZLockWrapper.java:21)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:82)
! at com.oracle.dcs.commons.task.TaskApi.call(TaskApi.java:37)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:39)
! at com.oracle.dcs.commons.task.TaskSequential.call(TaskSequential.java:10)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:82)
! at com.oracle.dcs.commons.task.TaskWrapper.call(TaskWrapper.java:17)
! at java.util.concurrent.FutureTask.run(FutureTask.java:266)
! at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
! at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
! at java.lang.Thread.run(Thread.java:748)
2019-11-27 13:54:30,106 INFO [database Service deletion for 89e11f5d-9789-44a3-a09d-2444f0fda99e : JobId=05a2d017-9b64-4e92-a7df-3ded603d0644] [] c.o.d.a.z.DCSZooKeeper: DCS node id is - node_0
2019-11-27 13:54:30,106 DEBUG [database Service deletion for 89e11f5d-9789-44a3-a09d-2444f0fda99e : JobId=05a2d017-9b64-4e92-a7df-3ded603d0644] [] c.o.d.a.t.TaskZJsonRpc: Task[TaskZJsonRpcExt_124] RPC request 'Local:node_0@deleteListenerEntry()' completed: Failure

The key error to note is : 'Local:node_0@deleteListenerEntry()' completed: Failure

Explanation

This problem comes from the fact that the listener.ora file has been customized. As per Oracle Support, on an ODA the listener.ora should never be customized and the default listener.ora file should be used. I still have an SR open with Oracle Support to clarify the situation, as I’m fully convinced that this is a regression :

  1. It was always possible in previous ODA versions to delete a database with a customized listener file.
  2. We need to customize the listener when setting up Data Guard on Oracle version 11.2.0.4 (still supported on ODA).
  3. We need to customize the listener when doing a duplication, as dynamic registration is not possible when the database is in nomount state, and the database is restarted during the duplication.

Moreover, other ODA documents still refer to customization of the listener.ora file when using ODA :
White paper : STEPS TO MIGRATE NON-CDB DATABASES TO ACFS ON ORACLE DATABASE APPLIANCE 12.1.2
Deploying Oracle Data Guard with Oracle Database Appliance – A WhitePaper (2016-7) (Doc ID 2392307.1)

I will update the post as soon as I have some feedback from Oracle support on this.

The workaround is to set back the default listener.ora file at the time of the deletion, which would require a maintenance window for some customers.

Solution/Workaround

Backup of the current listener configuration

OK, so let’s back up our current listener configuration first :

grid@ODA01:/home/grid/ [+ASM1] cd $TNS_ADMIN
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] cp -p listener.ora ./history/listener.ora.20200506

Default ODA listener configuration

The backup of the default listener configuration is the following one :

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] cat listener19071611AM2747.bak
LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))) # line added by Agent
ASMNET1LSNR_ASM=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=ASMNET1LSNR_ASM)))) # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_ASMNET1LSNR_ASM=ON # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_ASMNET1LSNR_ASM=SUBNET # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON # line added by Agent
VALID_NODE_CHECKING_REGISTRATION_LISTENER=SUBNET # line added by Agent

Stopping the listener

Let’s stop the listener :

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl stop listener -listener listener
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl status listener -listener listener
Listener LISTENER is enabled
Listener LISTENER is not running

Put the default listener configuration in place

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] mv listener.ora listener.ora.before_db_del_20200506
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] cp -p listener19071611AM2747.bak listener.ora

Start the listener

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl start listener -listener listener
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl status listener -listener listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): oda01

Delete database

We will try to delete the database again by running the same odacli command :

[root@ODA01 bin]# odacli delete-database -in TEST -fd
{
"jobId" : "5655be19-e0fe-4452-b8a9-35382c67bf96",
"status" : "Running",
"message" : null,
"reports" : [ {
"taskId" : "TaskZJsonRpcExt_1167",
"taskName" : "Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion",
"taskResult" : "",
"startTime" : "May 06, 2020 11:45:01 AM CEST",
"endTime" : "May 06, 2020 11:45:01 AM CEST",
"status" : "Success",
"taskDescription" : null,
"parentTaskId" : "TaskSequential_1165",
"jobId" : "5655be19-e0fe-4452-b8a9-35382c67bf96",
"tags" : [ ],
"reportLevel" : "Info",
"updatedTime" : "May 06, 2020 11:45:01 AM CEST"
} ],
"createTimestamp" : "May 06, 2020 11:45:01 AM CEST",
"resourceList" : [ ],
"description" : "Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51",
"updatedTime" : "May 06, 2020 11:45:01 AM CEST"
}

Unfortunately the deletion will fail with another error : DCS-10011:Input parameter ‘ACFS Device for delete’ cannot be NULL.

This is due to the fact that the previous deletion already removed the corresponding ACFS volumes for the database (DATA and REDO). We will have to create them manually again. I have already described this solution in a previous post : Database deletion stuck in deleting-status.

After restoring the corresponding ACFS volumes, we can retry our database deletion :

[root@ODA01 bin]# odacli delete-database -in TEST -fd
{
"jobId" : "5e227755-478b-46c5-a5cd-36687cb21ed8",
"status" : "Running",
"message" : null,
"reports" : [ {
"taskId" : "TaskZJsonRpcExt_1443",
"taskName" : "Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion",
"taskResult" : "",
"startTime" : "May 06, 2020 11:47:53 AM CEST",
"endTime" : "May 06, 2020 11:47:53 AM CEST",
"status" : "Success",
"taskDescription" : null,
"parentTaskId" : "TaskSequential_1441",
"jobId" : "5e227755-478b-46c5-a5cd-36687cb21ed8",
"tags" : [ ],
"reportLevel" : "Info",
"updatedTime" : "May 06, 2020 11:47:53 AM CEST"
} ],
"createTimestamp" : "May 06, 2020 11:47:53 AM CEST",
"resourceList" : [ ],
"description" : "Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51",
"updatedTime" : "May 06, 2020 11:47:53 AM CEST"
}

This time the deletion is successful :

[root@ODA01 bin]# odacli describe-job -i "5e227755-478b-46c5-a5cd-36687cb21ed8"
 
Job details
----------------------------------------------------------------
ID: 5e227755-478b-46c5-a5cd-36687cb21ed8
Description: Database service deletion with db name: TEST with id : d6542252-dfa4-47f9-9cfc-22b4f0575c51
Status: Success
Created: May 6, 2020 11:47:53 AM CEST
Message:
 
Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Validate db d6542252-dfa4-47f9-9cfc-22b4f0575c51 for deletion May 6, 2020 11:47:53 AM CEST May 6, 2020 11:47:53 AM CEST Success
Database Deletion May 6, 2020 11:47:53 AM CEST May 6, 2020 11:47:54 AM CEST Success
Unregister Db From Cluster May 6, 2020 11:47:54 AM CEST May 6, 2020 11:47:54 AM CEST Success
Kill Pmon Process May 6, 2020 11:47:54 AM CEST May 6, 2020 11:47:54 AM CEST Success
Database Files Deletion May 6, 2020 11:47:54 AM CEST May 6, 2020 11:47:54 AM CEST Success
Deleting Volume May 6, 2020 11:48:01 AM CEST May 6, 2020 11:48:05 AM CEST Success
Delete File Groups of Database TEST May 6, 2020 11:48:05 AM CEST May 6, 2020 11:48:05 AM CEST Success

Restore our customized listener configuration

We can now restore our customized configuration as follows :

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl stop listener -listener listener
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl status listener -listener listener
Listener LISTENER is enabled
Listener LISTENER is not running
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] mv listener.ora.before_db_del_20200506 listener.ora
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl start listener -listener listener
 
grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] srvctl status listener -listener listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): oda01

We can also confirm that the listener started successfully by displaying the running tnslsnr processes :

grid@ODA01:/u01/app/18.0.0.0/grid/network/admin/ [+ASM1] ps -ef | grep tnslsnr | grep -v grep
grid 14922 1 0 10:52 ? 00:00:00 /u01/app/18.0.0.0/grid/bin/tnslsnr ASMNET1LSNR_ASM -no_crs_notify -inherit
grid 97812 1 0 12:07 ? 00:00:00 /u01/app/18.0.0.0/grid/bin/tnslsnr LISTENER -no_crs_notify -inherit

Conclusion

Starting with ODA Release 18.5, database deletion will fail if the listener has been customized. The workaround is to restore the default listener configuration before executing the deletion. This might imply a maintenance window for some customers.

The article Issue deleting a database on ODA? appeared first on Blog dbi services.

How to use DBMS_SCHEDULER to improve performance ?

Sun, 2020-05-24 18:28

From an application point of view, the Oracle scheduler DBMS_SCHEDULER allows you to reach better performance by parallelizing your processing.

Let’s start with the following PL/SQL code, which inserts several rows serially from a metadata table into a target table. In my example, the metadata table does not contain the data directly but a set of SQL statements to be executed, whose returned rows must be inserted into the target table My_Target_Table_Serial :

Let’s verify the contents of the source table called My_Metadata_Table:

SQL> SELECT priority, dwh_id, amq_name, sql_statement, scope FROM My_Metadata_Table;
PRIORITY DWH_ID AMQ_NAME SQL_STATEMENT          SCOPE
1	7	AAA1	 SELECT SUM(P.age pt.p	TYPE1
2	28	BBB2  	 SELECT CASE WHEN pt.p	TYPE1
3	37	CCC3	 "select cm.case_id fr"	TYPE2
4	48	DDD4	 "select cm.case_id fr"	TYPE2
5	73	EEE5	 SELECT DISTINCT pt.p	TYPE1
6	90	FFF6 	 SELECT LAG(ORW pt.p	TYPE1
7	114	GGG7	 SELECT distinct pt.	TYPE1
8	125	HHH8	 SELECT DISTINCT pt.p	TYPE1
...
148    115     ZZZ148   SELECT ROUND(TO_NUMBER TYPE2

Now let’s check the PL/SQL program :

DECLARE
  l_errm VARCHAR2(200);
  l_sql  VARCHAR2(32767) := NULL;
  sql_statement_1  VARCHAR2(32767) := NULL;
  sql_statement_2  VARCHAR2(32767) := NULL;
  l_amq_name VARCHAR2(200);
  l_date NUMBER;
BEGIN
  SELECT TO_NUMBER(TO_CHAR(SYSDATE,'YYYYMMDDHH24MISS')) INTO l_date FROM dual;
  FOR rec IN (SELECT dwh_id, amq_name, sql_statement,scope 
                FROM My_Metadata_Table,
                     (SELECT dwh_pit_date FROM dwh_code_mv) pt
               WHERE dwh_status = 1
                 AND (pt.dwh_pit_date >= dwh_valid_from AND pt.dwh_pit_date < dwh_valid_to) 
               ORDER BY priority, dwh_id) LOOP
    ...
    sql_statement_1 := substr(rec.sql_statement, 1, 32000);
    sql_statement_2 := substr(rec.sql_statement, 32001);
    IF rec.SCOPE = 'TYPE1' THEN 
      -- TYPE1 LEVEL SELECT
      l_sql := 'INSERT /*+ APPEND */ INTO My_Target_Table_Serial (dwh_pit_date, AMQ_ID, AMQ_TEXT, CASE_ID, ENTERPRISE_ID)'||CHR(13)|| 'SELECT DISTINCT TO_DATE(code.dwh_pit_date, ''YYYYMMDDHH24MISS''),'||rec.dwh_id|| ',''' ||rec.amq_name ||''', case_id, 1'||CHR(13)
      || ' FROM (SELECT dwh_pit_date FROM dwh_code) code, ('||sql_statement_1;
      EXECUTE IMMEDIATE l_sql || sql_statement_2 || ')';
      COMMIT;    
    ELSE 
      -- TYPE2 LEVEL SELECT
      l_sql :=  'INSERT /*+ APPEND */ INTO My_Target_Table_Serial (dwh_pit_date, AMQ_ID, AMQ_TEXT, CASE_ID, ENTERPRISE_ID)
      SELECT DISTINCT TO_DATE(code.dwh_pit_date, ''YYYYMMDDHH24MISS''), '||rec.dwh_id|| ',''' ||rec.amq_name || ''', cm.case_id, cm.enterprise_id'||CHR(13)
      || '  FROM (SELECT dwh_pit_date FROM dwh_code) code, v_sc_case_master cm, v_sc_case_event ce, ('||sql_statement_1;
              
      EXECUTE IMMEDIATE l_sql || sql_statement_2 || ') pt'||CHR(13)
      || ' WHERE cm.case_id = ce.case_id'||CHR(13) 
      || '   AND cm.deleted IS NULL AND cm.state_id <> 1'||CHR(13)
      || '   AND ce.deleted IS NULL AND ce.pref_term = pt.pt_name';
      COMMIT;         
    END IF;
    ...
   END LOOP:
END;
Number of Rows Read : 148 (meaning 148 SQL statements to execute)
START : 16:17:46
END : 16:57:42
Total :  40 mins

 

As we can see, each SQL statement is executed serially. Let’s check the audit table recording the loading time (insert time) and the “scheduling” :

CREATE_DATE		NAME	START_DATE		END_DATE            LOADING_TIME
22.05.2020 16:46:34	AAA1	22.05.2020 16:46:34	22.05.2020 16:57:42    11.08mins
22.05.2020 16:42:05	BBB2	22.05.2020 16:42:05	22.05.2020 16:46:34    04.29mins
22.05.2020 16:41:15	CCC3	22.05.2020 16:41:15	22.05.2020 16:42:05    50sec
22.05.2020 16:40:42	DDD4	22.05.2020 16:40:42	22.05.2020 16:41:15    32sec
22.05.2020 16:40:20	EEE5	22.05.2020 16:40:20	22.05.2020 16:40:42    22sec
22.05.2020 16:37:23	FFF6	22.05.2020 16:37:23	22.05.2020 16:40:20    02.57mins
22.05.2020 16:37:12	GGG7	22.05.2020 16:37:12	22.05.2020 16:37:23    11sec
...
22.05.2020 16:36:03	ZZZ148	22.05.2020 16:17:35	22.05.2020 16:17:46    11sec

To resume :

  • The 148 rows (148 SQL statements) coming from the source table are loaded serially in 40 mins.
  • The majority of rows took less than 1 min to load (e.g. Name = CCC3, DDD4, EEE5, GGG7 and ZZZ148).
  • A few rows took more than a couple of minutes to load.
  • The maximum loading time is 11.08 mins, for the name “AAA1”.
  • Each row must wait for the previous row to complete its loading before starting its own (compare the previous END_DATE vs the current START_DATE).
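Generically, the serial total is the sum of the individual durations, while a fully parallel run is bounded by the longest one. A small Python stand-in illustrates the point (sleeps replace the real inserts; the durations are illustrative, not from the post):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def load_one(duration):
    """Stand-in for one 'INSERT ... SELECT' statement: just sleeps."""
    time.sleep(duration)

durations = [0.2, 0.05, 0.05, 0.1]   # illustrative per-statement run times

start = time.monotonic()
for d in durations:
    load_one(d)                       # serial: each waits for the previous one
serial = time.monotonic() - start

start = time.monotonic()
with ThreadPoolExecutor(max_workers=len(durations)) as pool:
    list(pool.map(load_one, durations))   # one worker per statement
parallel = time.monotonic() - start

# serial ≈ sum(durations), parallel ≈ max(durations)
```

Scheduling one DBMS_SCHEDULER job per statement applies the same idea at the database level, which is what the rework below aims for.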

To optimize the process, let’s try to load all the rows coming from the source table in parallel by using the Oracle scheduler DBMS_SCHEDULER.

Instead of executing the insert command directly in the loop, let’s create a job through DBMS_SCHEDULER:

FOR rec IN (SELECT priority,dwh_id, amq_name, sql_statement,scope 
                FROM My_Metadata_Table,
                     (SELECT dwh_pit_date FROM dwh_code_mv) pt
               WHERE dwh_status = 1
                 AND (pt.dwh_pit_date >= dwh_valid_from AND pt.dwh_pit_date < dwh_valid_to) 
               ORDER BY priority, dwh_id) LOOP

     l_amq_name := rec.amq_name;
       IF rec.SCOPE = 'TYPE1' THEN 
        -- TYPE1 LEVEL SELECT
         ...
  
            --Execute Job to insert the AMQ : Background process
            DBMS_SCHEDULER.CREATE_JOB (
            job_name             => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
            job_type             => 'PLSQL_BLOCK',
            job_action           => 'BEGIN
                                      LOAD_DATA(''CASE'','||''''||l_amq_name||''''||','||rec.priority||','||l_date||','||v_SESSION_ID||','||i||');
                                     END;',
            start_date    =>  sysdate,  
            enabled       =>  TRUE,  
            auto_drop     =>  TRUE,  
            comments      =>  'job for amq '||l_amq_name);
          END IF;
        ELSE 
            ...
            END IF;
        END IF; 
      i := i +1;
  END LOOP;
Number of Rows Read : 148 (meaning 148 SQL statements to execute)
START : 08:14:03
END : 08:42:32
Total :  27.57 mins

To resume :

  • The 148 rows (148 SQL statements) coming from the source table are now loaded in parallel in 27.57 mins instead of 40 mins in serial.
  • The relevant DBMS_SCHEDULER options are :
    • As we are limited in the number of characters for the “job_action” parameter, we insert the data through a PL/SQL procedure, LOAD_DATA.
    • The job is executed immediately (start_date=sysdate) and purged immediately after its execution (auto_drop=TRUE).

Let’s check now how the jobs are scheduled. Since we loop 148 times, I expect to have 148 jobs:

First, let’s check whether the rows (remember: one row = one insert into the target table from the source table) are loaded in parallel :

CREATE_DATE 	    NAME START_DATE 	        END_DATE 				       
22.05.2020 16:46:34 AAA1 23.05.2020 08:14:04	23.05.2020 08:21:19
22.05.2020 16:42:05 BBB2 23.05.2020 08:14:04	23.05.2020 08:20:43
22.05.2020 16:41:15 CCC3 23.05.2020 08:14:04	23.05.2020 08:21:59
22.05.2020 16:40:42 DDD4 23.05.2020 08:14:03	23.05.2020 08:15:29
22.05.2020 16:40:20 EEE5 23.05.2020 08:14:03	23.05.2020 08:15:05
22.05.2020 16:37:23 FFF6 23.05.2020 08:14:03	23.05.2020 08:14:47
22.05.2020 16:37:12 GGG7 23.05.2020 08:14:03	23.05.2020 08:15:59
...                     
22.05.2020 16:36:03 ZZZ148 22.05.2020 16:17:35 22.05.2020 16:17:46

This is the case: all rows have the same START_DATE, meaning all rows start in parallel. Let’s check in “all_scheduler_job_run_details” that we have our 148 jobs in parallel :

SQL> select count(*) from all_scheduler_job_run_details where job_name like '%20200523081403';

  COUNT(*)
----------
       148
SQL> select log_date,job_name,status,req_start_date from all_scheduler_job_run_details where job_name like '%20200523081403';
LOG_DATE		JOB_NAME		        STATUS		REQ_START_DATE
23-MAY-20 08.42.41	AMQ_P3J147_20200523081403	SUCCEEDED	23-MAY-20 02.42.32
23-MAY-20 08.42.32	AMQ_P2J146_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.37.56	AMQ_P2J145_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.37.33	AMQ_P2J144_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.37.22	AMQ_P2J143_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.37.03	AMQ_P2J141_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.36.50	AMQ_P2J142_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
23-MAY-20 08.33.57	AMQ_P2J140_20200523081403	SUCCEEDED	23-MAY-20 02.23.13
--Only the first 8 rows are displayed

To resume :

  • We have 148 jobs, all started, most of the time in parallel (jobs with the same REQ_START_DATE; Oracle parallelizes jobs in blocks, randomly).
  • My PL/SQL process now takes 27.57 mins instead of 40 mins.

But if we look at the details, we have a lot of small jobs, i.e. jobs where run_duration is less than 1 min:

SQL> select run_duration from all_scheduler_job_run_details where job_name like '%20200523081403' order by run_duration;

RUN_DURATION
+00 00:00:04.000000
+00 00:00:07.000000
+00 00:00:09.000000
+00 00:00:10.000000
+00 00:00:13.000000
+00 00:00:15.000000
+00 00:00:20.000000
+00 00:00:27.000000
+00 00:00:33.000000
+00 00:00:35.000000
+00 00:00:36.000000
+00 00:00:38.000000
+00 00:00:43.000000
+00 00:00:46.000000
+00 00:00:51.000000
+00 00:00:52.000000

As we have a lot of small (short-lived) jobs, it is more efficient to use lightweight jobs instead of regular jobs.

Contrary to regular jobs, lightweight jobs:

  • Require less metadata, so they are quicker to create and drop.
  • Are well suited for short-lived jobs (small jobs with a low run_duration).

Let’s rewrite our PL/SQL process using lightweight jobs.

To use lightweight jobs, first create a program suitable for a lightweight job:

begin
dbms_scheduler.create_program
(
    program_name=>'LIGHTWEIGHT_PROGRAM',
    program_action=>'LOAD_AMQ',
    program_type=>'STORED_PROCEDURE',
    number_of_arguments=>6, 
    enabled=>FALSE);
END;

Add the arguments (parameters) and enable the program:

BEGIN
dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>1,
argument_type=>'VARCHAR2',
DEFAULT_VALUE=>NULL);

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>2,
argument_type=>'VARCHAR2');

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>3,
argument_type=>'NUMBER');

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>4,
argument_type=>'NUMBER');

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>5,
argument_type=>'VARCHAR2');

dbms_scheduler.DEFINE_PROGRAM_ARGUMENT(
program_name=>'lightweight_program',
argument_position=>6,
argument_type=>'NUMBER');

dbms_scheduler.enable('lightweight_program');  
end;

In the PL/SQL code, let's create the lightweight jobs, without forgetting to set the argument values before running each job:

DECLARE
...
BEGIN
....
LOOP
DBMS_SCHEDULER.create_job (
job_name        => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
program_name    => 'LIGHTWEIGHT_PROGRAM',
job_style       => 'LIGHTWEIGHT',
enabled         => FALSE);
                  
 DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 1,
   argument_value          => rec.scope);
   
DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 2,
   argument_value          => l_amq_name);
   
DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 3,
   argument_value          => rec.priority);

DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 4,
   argument_value          => l_date);   

DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 5,
   argument_value          => v_SESSION_ID);  

DBMS_SCHEDULER.SET_JOB_ARGUMENT_VALUE (
   job_name                => 'AMQ_P'||rec.priority||'j'||i||'_'||l_date,
   argument_position       => 6,
   argument_value          => i); 

dbms_scheduler.run_job('AMQ_P'||rec.priority||'j'||i||'_'||l_date,TRUE);
...
END LOOP;

Number of rows read: 148 (meaning 148 SQL statements to execute)
START: 18:08:56
END: 18:27:40
Total: 18.84 mins

Let’s check that we still have 148 jobs in parallel:

SQL> select count(*) from all_scheduler_job_run_details where job_name like '%20200524175036';

  COUNT(*)
----------
       148
SQL> select log_date,job_name,status,req_start_date from all_scheduler_job_run_details where job_name like '%20200524175036';

LOG_DATE           JOB_NAME     STATUS	        REQ_START_DATE
24-MAY-20 05.50.51 AB1C		SUCCEEDED	24-MAY-20 05.50.36
24-MAY-20 05.50.56 AB1D		SUCCEEDED	24-MAY-20 05.50.51
24-MAY-20 05.51.14 AB1E		SUCCEEDED	24-MAY-20 05.50.56
24-MAY-20 05.51.49 AB1I		SUCCEEDED	24-MAY-20 05.51.14
24-MAY-20 05.52.14 AB1P		SUCCEEDED	24-MAY-20 05.51.49
24-MAY-20 05.52.34 AB1L		SUCCEEDED	24-MAY-20 05.52.14
24-MAY-20 05.52.55 AB1N		SUCCEEDED	24-MAY-20 05.52.34
24-MAY-20 05.53.17 AB1M		SUCCEEDED	24-MAY-20 05.52.55
24-MAY-20 05.53.29 AB1K		SUCCEEDED	24-MAY-20 05.53.17
24-MAY-20 05.53.39 AB1O		SUCCEEDED	24-MAY-20 05.53.29
24-MAY-20 05.53.57 AB1U		SUCCEEDED	24-MAY-20 05.53.39
24-MAY-20 05.54.07 AB1V		SUCCEEDED	24-MAY-20 05.53.57

To summarize:

  • All 148 jobs started, most of them in parallel.
  • My PL/SQL process now took 18.54 minutes (lightweight jobs) instead of 27.57 minutes (regular jobs).
  • Comparing regular jobs with lightweight jobs, the former seem to be scheduled in blocks of varying size (4, 5, 6… up to 8 jobs), while the latter are scheduled in blocks of 3 or 4 (as we can see above).

Conclusion:

  • DBMS_SCHEDULER (regular or lightweight jobs) can significantly improve the performance of your PL/SQL by transforming a serial process into a parallel one.
  • If you have small (short-lived) jobs, use lightweight jobs instead of regular jobs.
  • Don’t underestimate the development time (development, tests, bug fixing) needed to transform a serial process into a parallel one. Creating one job is quite different from creating hundreds or thousands of jobs through a PL/SQL loop (concurrency problems, CPU consumed by creating/dropping the jobs).
  • As a developer, you are responsible for managing your jobs (create, drop, purge) so as not to exhaust the Oracle parameter job_queue_processes (used by many critical Oracle processes).
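The last point, capping concurrency so the scheduler's job slots are not exhausted, can be illustrated with a throttling pattern. This is a hedged Python sketch, not DBMS_SCHEDULER itself: a semaphore plays the role of the job_queue_processes budget, and we verify that the observed peak concurrency never exceeds the cap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

LIMIT = 8                                 # stand-in for a safe job-slot budget
gate = threading.BoundedSemaphore(LIMIT)  # at most LIMIT jobs run at once
lock = threading.Lock()
running = 0
peak = 0

def job(i):
    """One of the 148 units of work, throttled by the semaphore."""
    global running, peak
    with gate:
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.005)                 # simulate the job's work
        with lock:
            running -= 1
    return i

with ThreadPoolExecutor(max_workers=64) as pool:
    results = list(pool.map(job, range(148)))

print(f"148 jobs done, peak concurrency: {peak}")
```

In PL/SQL the equivalent throttle would be counting your own running jobs (e.g. from user_scheduler_running_jobs) before submitting new ones.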

This article, How to use DBMS_SCHEDULER to improve performance?, first appeared on the dbi services blog.

PostgreSQL Shared Buffers vs free RAM

Wed, 2020-05-20 08:54

PostgreSQL, like all other database engines, modifies table and index blocks in shared buffers. People think that the main goal of buffered reads is to act as a cache and avoid reading from disk. But that's not the main reason, as caching there is not mandatory: PostgreSQL expects the filesystem cache to be used as well. The primary goal of shared buffers is simply to share them, because multiple sessions may want to read or write the same blocks, and concurrent access is managed at block level in memory. Without shared buffers, you would need to lock a whole table. Most database engines also use the shared buffers for caching: allocating more memory can keep the frequently used blocks in memory rather than reading them from disk. And because they manage this cache with methods suited to the database (for performance and reliability), they don't need another level of cache and recommend direct I/O to the database files. But not Postgres. In order to keep the database engine simple and portable, PostgreSQL relies on the filesystem cache. For example, no multiblock read is implemented: reading contiguous blocks is expected to be optimized by the filesystem read-ahead.
But this simplicity of code implies complexity in configuring a PostgreSQL database system. How much should shared_buffers be set to? And how much RAM should be kept free in the OS for the filesystem cache?

I am not giving any answer here. And I think there is, unfortunately, no single answer. The documentation is vague: it defines the recommended value as a percentage of the available RAM. That makes no sense to me. The cache is there to keep the frequently used blocks, and that depends on your application workload, not on the available RAM. There is also the idea that, because there is double buffering, you should allocate the same size to both caches. But that makes no sense either: if you keep blocks in the shared buffers, they will not be frequently accessed from the filesystem and will not stay in the filesystem cache. Finally, don't think the defaults are good. The default shared_buffers is always too small, and the reason is that a small value eases installation on Linux without having to think about the SHM size. But you have more RAM, need more buffers, and will need to configure SHM and huge pages for them.

My only recommendation is to understand your workload: what is the set of data that is frequently used (lookup tables, index branches, active customers, last orders,…)? This should fit in the shared buffers. My second recommendation is to understand how it works. For the set of data that is frequently read only, this is not so difficult: you need an idea of the Postgres cache algorithm (LRU/clock sweep in the shared buffers, which have a static size) and of the filesystem cache (LRU over the variable amount of free memory). For the set of data that is modified, this is much more complex.
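To build an intuition for why the frequently used set should fit in the shared buffers, here is a small, hypothetical simulation of a clock-sweep buffer cache (a simplified model of the algorithm, not PostgreSQL's actual implementation): when the working set fits in the cache, the hit ratio approaches 100%; when it doesn't, it collapses toward cache_size/working_set under uniform random access.

```python
import random

def clock_sweep_hit_ratio(cache_size, working_set, accesses=20000, seed=1):
    """Simulate uniform random reads over `working_set` blocks through a
    clock-sweep cache of `cache_size` buffers; return the hit ratio."""
    rng = random.Random(seed)
    buffers = [None] * cache_size   # which block each buffer holds
    usage = [0] * cache_size        # usage count per buffer (capped at 5)
    where = {}                      # block -> buffer index
    hand = 0
    hits = 0
    for _ in range(accesses):
        block = rng.randrange(working_set)
        if block in where:
            hits += 1
            idx = where[block]
            usage[idx] = min(usage[idx] + 1, 5)
            continue
        # Miss: sweep the clock hand, decrementing usage counts,
        # until a buffer with usage 0 is found to evict.
        while usage[hand] > 0:
            usage[hand] -= 1
            hand = (hand + 1) % cache_size
        if buffers[hand] is not None:
            del where[buffers[hand]]
        buffers[hand] = block
        where[block] = hand
        usage[hand] = 1
        hand = (hand + 1) % cache_size
    return hits / accesses

fits = clock_sweep_hit_ratio(cache_size=200, working_set=150)
too_small = clock_sweep_hit_ratio(cache_size=100, working_set=150)
print(f"cache 200 > set 150: {fits:.2f}, cache 100 < set 150: {too_small:.2f}")
```

The block counts here are arbitrary; the point is the cliff between "working set fits" and "working set doesn't fit", which is exactly what the runs below measure with real shared_buffers sizes.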

Let’s start with reads. As an example, I have run a workload that reads at random within a small set of 150MB. I used pgio, from Kevin Closson, for that. I’ve executed multiple runs, varying shared_buffers from smaller than my working set to twice as large: 50MB, 100MB, 150MB, 200MB, 250MB, 300MB. Then, for each size of shared buffers, I’ve run with variations in the available RAM (which is used by the filesystem cache): 50MB, 100MB, 150MB, 200MB, 250MB, 300MB, 350MB, 400MB. I ensured that I have enough physical memory so that the system does not swap.

I determined the free space in my system:


[ec2-user@ip-172-31-37-67 postgresql]$ sync ; sync ; sync ; free -hw
              total        used        free      shared     buffers       cache   available
Mem:           983M         63M        835M        436K          0B         84M        807M
Swap:            0B          0B          0B

I also use https://medium.com/@FranckPachot/proc-meminfo-formatted-for-humans-350c6bebc380


[ec2-user@ip-172-31-37-67 postgresql]$ awk '/Hugepagesize:/{p=$2} / 0 /{next} / kB$/{v[sprintf("%9d MB %-s",int($2/1024),$0)]=$2;next} {h[$0]=$2} /HugePages_Total/{hpt=$2} /HugePages_Free/{hpf=$2} {h["HugePages Used (Total-Free)"]=hpt-hpf} END{for(k in v) print sprintf("%-60s %10d",k,v[k]/p); for (k in h) print sprintf("%9d MB %-s",p*h[k]/1024,k)}' /proc/meminfo|sort -nr|grep --color=auto -iE "^|( HugePage)[^:]*" #awk #meminfo
 33554431 MB VmallocTotal:   34359738367 kB                    16777215
      983 MB MemTotal:        1006964 kB                            491
      946 MB DirectMap2M:      968704 kB                            473
      835 MB MemFree:          855628 kB                            417
      808 MB MemAvailable:     827392 kB                            404
      491 MB CommitLimit:      503480 kB                            245
      200 MB Committed_AS:     205416 kB                            100
       78 MB DirectMap4k:       79872 kB                             39
       69 MB Cached:            71436 kB                             34
       55 MB Active:            56640 kB                             27
       43 MB Inactive:          44036 kB                             21
       42 MB Inactive(file):    43612 kB                             21
       29 MB Slab:              30000 kB                             14
       28 MB AnonPages:         29260 kB                             14
       28 MB Active(anon):      29252 kB                             14
       26 MB Active(file):      27388 kB                             13
       14 MB SUnreclaim:        14876 kB                              7
       14 MB SReclaimable:      15124 kB                              7
       13 MB Mapped:            14212 kB                              6
        4 MB PageTables:         4940 kB                              2
        2 MB Hugepagesize:       2048 kB                              1
        1 MB KernelStack:        1952 kB                              0
        0 MB Shmem:               436 kB                              0
        0 MB Inactive(anon):      424 kB                              0
        0 MB HugePages Used (Total-Free)
        0 MB HugePages_Total:       0
        0 MB HugePages_Surp:        0
        0 MB HugePages_Rsvd:        0
        0 MB HugePages_Free:        0
        0 MB Dirty:                 4 kB                              0

Then, in order to control how much free RAM remains for the filesystem cache ($fs_MB), I allocate the rest as huge pages:


sudo bash -c "echo 'vm.nr_hugepages = $(( (835 - $fs_MB) / 2 ))' > /etc/sysctl.d/42-hugepages.conf ; sysctl --system"
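The arithmetic in that command can be written out explicitly: everything that should not stay available as filesystem cache is turned into 2MB huge pages. A trivial sketch (835 is the MemFree value observed above):

```python
def nr_hugepages(free_mb, target_fs_cache_mb, hugepage_mb=2):
    """Number of huge pages to allocate so that only target_fs_cache_mb
    of the initially free RAM stays usable by the filesystem cache.
    Mirrors the (835 - $fs_MB) / 2 arithmetic of the shell command."""
    return (free_mb - target_fs_cache_mb) // hugepage_mb

# With 835MB free and a 400MB filesystem-cache target:
print(nr_hugepages(835, 400))
```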

This limits the RAM available for the filesystem cache, because huge pages cannot be used for it. And the huge pages can be used for the Postgres shared buffers:


sed -i -e "/shared_buffers/s/.*/shared_buffers = ${pg_MB}MB/" -e "/huge_pages =/s/.*/huge_pages = on/" $PGDATA/postgresql.conf
grep -E "(shared_buffers|huge_pages).*=" $PGDATA/postgresql.conf
/usr/local/pgsql/bin/pg_ctl -l $PGDATA/logfile restart

Note that I actually stop Postgres first, to be sure that no huge pages are in use when resizing them:


for pg_MB in 50 100 150 200 250 300 350 400
do
for fs_MB in 400 350 300 250 200 150 100 50
do
/usr/local/pgsql/bin/pg_ctl -l $PGDATA/logfile stop
# set huge pages and shared buffers
/usr/local/pgsql/bin/pg_ctl -l $PGDATA/logfile restart
# run pgio runit.sh
done
done

100% Read-only workload on 150MB

Here is the result. Each slice on the z-axis is a size of shared_buffers allocated by Postgres. On the x-axis is the size of the available RAM that can be used for the filesystem cache by the Linux kernel. The y-axis is the number of tuples read during the run.

You will never get optimal performance when the frequently read set doesn't fit in the shared buffers. When the read set is larger than the shared buffers, you need more RAM (for the filesystem cache) and still get lower performance. The frequently read set of data should fit in the shared buffers.

50% updates on 150MB

Here is the same run where I only changed PCT_UPDATE to 50 in pgio.conf

This looks similar but there are two main differences, one visible here and another that is not represented in this graphic because I aggregated several runs.

First, increasing the shared buffers above the set of frequently modified data still improves performance, which was not the case for reads. As soon as the shared buffers are above the working set of 150MB, the buffer cache hit ratio is at 100%. But that's for reads: updates generate a new version of the data, and both versions will have to be vacuumed and checkpointed.

Here is a graph of blks_read/s, which shows that for a read-only workload we do not do any physical reads (I/O calls from the database to the filesystem) as soon as the working set fits in the shared buffers. When we write, the number of read calls still improves when we increase the shared buffers a bit above the modified set. And the physical read efficiency is best when there is as much free RAM as shared buffers.

Second point about the write workload: performance is not homogeneous at all. Vacuum and checkpoint happen at regular intervals and make the performance unpredictable. When showing the tuples/second for the 50% write workload, I aggregated many runs to display the maximum throughput achieved. Having the modified set of data fit in free RAM helps lower this variation, as it avoids immediate physical write latency. The balance between shared buffers and free RAM is then a balance between high performance and predictability of performance: keep free RAM as a performance “buffer” between userspace access and disk reads. There are also many parameters to “tune” this, such as the frequency of vacuum and checkpoints. And this makes memory tuning very complex. Note that I changed only shared_buffers here: when increasing shared_buffers for a high write throughput, there are other structures to increase as well, such as the WAL segments.

The filesystem cache adds another level of unpredictable performance. For example, you may run a backup that reads the whole database, flooding the least-recently-used end of the cache. And I didn't do any sequential scans here; those benefit from the filesystem buffers with pre-fetching. All this makes any quick recommendation incorrect. Buffer cache hit ratios make no sense for estimating the response time improvement, as they are not aware of the filesystem cache hits. But looking at them per relation, table or index, may help to estimate which relations are frequently accessed. Because that's what matters: not the size of the database, not the size of your server's RAM, not the general buffer cache hit ratio, but the size of the data that is read and written frequently by your application.
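The per-relation hit ratio mentioned above is simple arithmetic over the counters exposed by pg_statio_user_tables (heap_blks_hit and heap_blks_read). A small sketch of the computation, with the caveat, as noted, that a "read" is only an I/O call to the OS and may still be served by the filesystem cache; the counter values below are hypothetical:

```python
def cache_hit_ratio(blks_hit, blks_read):
    """Shared-buffers hit ratio for one relation, from the
    heap_blks_hit / heap_blks_read counters of pg_statio_user_tables.
    A 'read' here is a request to the OS, which may itself be
    answered from the filesystem cache rather than from disk."""
    total = blks_hit + blks_read
    return blks_hit / total if total else 0.0

# Hypothetical counters for two relations:
print(cache_hit_ratio(990, 10))   # a lookup table that stays in shared buffers
print(cache_hit_ratio(120, 880))  # a relation mostly served from outside shared buffers
```

A high ratio for a relation tells you it lives in shared buffers; a low ratio tells you nothing about disk reads until you also know what the filesystem cache is doing.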

This article, PostgreSQL Shared Buffers vs free RAM, first appeared on the dbi services blog.
