International Workshop on Big and Open Data: Evolving Data Science Standards & Citation Attribution Practices

November 5-6, 2015. Venue: Indian National Science Academy

The International Nuclear Information System (INIS) (Ref: https://www.iaea.org/inis/) hosts one of the world's largest collections of published information on the peaceful uses of nuclear science and technology. It offers online access to a unique collection of non-conventional literature. INIS is operated by the IAEA in collaboration with over 150 members.

BIG DATA SCIENCE: The generation and use of big numerical databases in pursuit of excellence in science, taking more detailed features into account in multiscale, multiphysics modelling. The use of big scientific databases helps us do better science as knowledge progresses, with increasing complexity in the data collected by costly micro (differential) and integral experiments. Big data science is a natural consequence of advances in computer hardware and software, in experimental techniques and analysis, and in multiscale, multiphysics simulations (“old wine in a new bottle”). It humbles us.

Internationally, the evolution of the subject of big scientific databases and knowledge management has broadly included six or more base technology efforts (next slide), in a generic sense across disciplines, using advances in computer science, numerical simulations, physics and technology.

1. Measurements: “raw data”. Physics is observational.
2. Systematic compilations of raw experimental data. Challenges in managing this task. No technical judgment on the quality of data. Deep technical knowledge is needed.
3. Evaluation of numerical data (“evaluation of data” is not understood by many in India; it is mistaken for model calculations). Includes science-based models, systematics and statistical/mathematical inference tools.
4. Visualization and processing of large databases for Monte Carlo applications.
5. Applications, e.g., for multiphysics, multiscale modelling (MMM3).
6. Integral results based upon experiments on real physical systems. (How to reduce the number of integral experiments?)

Citation practices must be followed in each of the above tasks. Workers should publish or document their data science work in appropriate journals or in laboratory reports.

Nuclear data science
Internationally, the evolution of the subject of preparing working nuclear data libraries has included five basic technologies (items 1 to 5), which feed the design application (item 6):
1. Nuclear data physics experiments: cross-section measurements, covariances.
2. Compilations of measured raw data: covariances, digitization.
3. Cross-section evaluations, which include nuclear models and statistical tools.
4. Cross-section processing.
5. Integral experiments. (How to reduce the number of integral experiments?)
6. Neutron-photon coupled transport calculations and response functions (reactor design with plug-in nuclear libraries). Use of covariances to define error margins due to uncertainties in nuclear data.
The process of obtaining the working libraries for design calculations requires an iterative sequence of events to yield a quality-assured transport cross-section library. Steps 1 to 5 involve doing science, with an effort three or more orders of magnitude greater than the reactor design work that starts from plug-in nuclear data libraries.

The nuclear data physics efforts have helped eliminate the pitfalls of an individual's limited ability to manage the entire knowledge base related to a specific study or application. “Big data science” helps to increase system intelligence and system performance by bringing in correlations. It has become affordable at large scale owing to advances in computer science.

• Reactor design is shaped by a number of issues and considerations, such as materials development, designability and the inclusion of passive safety.
• One important point is that basic nuclear data physics research, and the data science associated with it, has been essential in shaping the concepts of nuclear power reactor designs.
• Nuclear data physics efforts are base technology efforts and are an essential part of MMM3 (multiphysics, multiscale modeling) of nuclear systems for all energy and non-energy applications.

The challenging issue in multiphysics, multiscale modeling is that the correct physics information should pass from one scale to the next with full consistency of the physical laws and with no break in continuity. The multiphysics, multiscale modeling should enable us to “zoom in” on regions that are particularly sensitive to certain parameters.

The obvious advantage, if one is able to make progress in multiphysics, multiscale modeling, is that the number of costly integral experiments, which would otherwise take several years to conduct, can be significantly reduced. A successful programme of multiphysics, multiscale modeling of current and future nuclear reactors involves well-coordinated scientific teamwork, with several disciplines participating over the long term. The culture of teamwork needed to make effective progress has to be nurtured. The evolution of this strategy also involves fixing, with reasonable accuracy, the many basic physics data that go into the neutronics, thermal hydraulics, radiation damage, chemical changes, etc., and performing a large number of coupled sensitivity studies to identify the areas and needs for experiments, both basic and applied, in each of the disciplines.

The resulting design document of a plant using a perfect multiphysics, multiscale modeling is a dream come true for any plant operator. The confidence to take up multiphysics, multiscale modeling has arisen in the history of nuclear energy and other areas because of improved computer resources and software developments coupled with the ability to continuously update the basic physics data bases to unprecedented details and accuracy and integral data bases at different integral levels.

The data science activities in nuclear data science tailored to the Indian Atomic Energy applications are generically useful in other areas of science in the Indian context. There is a need to share the data science expertise horizontally across areas and disciplines.

Reference: http://www.iaea.org/inis/collection/NCLCollectionStore/_Public/28/015/28015549.pdf
“Citation Guidelines for Nuclear Data Retrieved from Databases Resident at the Nuclear Data Centers Network”, by Viktoria McLane, July 1996, BNL, USA. Lab Report: BNL-NCS-63381.
My observations: citation guidelines are followed by experts in nuclear data but not necessarily by the users, though the situation is improving.

Extracted from “Citation Guidelines for Nuclear Data Retrieved from Databases Resident at the Nuclear Data Centers Network”, by Viktoria McLane, July 1996, BNL, USA, Lab Report: BNL-NCS-63381:
“Data source
Data obtained from the databases residing at the member organizations of the Nuclear Data Centers Network should be properly cited. In general, there should be a citation of the original source of the information used, as well as of the database from which the data were extracted. The source of the information should be cited as:
• Data retrieved (or extracted) from the (center name) Online Data Service,
• Data retrieved (or extracted) from the (center name) WorldWideWeb site,
• Data received by electronic file transfer from (center name),
• Data received from (center name).
These data bases may contain essential information which does not exist in a published article. Since the databases are periodically updated, it is important to include the date and/or revision number of the version of the database used.”
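As a minimal sketch of how the quoted guidance could be applied in practice, the snippet below assembles a citation that names the original source, the route by which the data were obtained, and the date and revision of the database. The function names, the dictionary keys and the punctuation of the output are assumptions made for illustration; the guidelines prescribe the content of the citation, not this exact format, and the reference and center name are placeholders.

```python
from datetime import date

# Access routes listed in the quoted guidelines; the dictionary keys are
# hypothetical labels chosen for this sketch.
ACCESS_PHRASES = {
    "online": "Data retrieved from the {center} Online Data Service",
    "web": "Data retrieved from the {center} WorldWideWeb site",
    "file_transfer": "Data received by electronic file transfer from {center}",
    "direct": "Data received from {center}",
}


def database_citation(center, access, database, retrieved, revision=None):
    """Build the database part of a citation (illustrative format only)."""
    if access not in ACCESS_PHRASES:
        raise ValueError(f"unknown access route: {access}")
    text = ACCESS_PHRASES[access].format(center=center)
    text += f", database {database}"
    if revision is not None:
        text += f", revision {revision}"
    # The date matters because the databases are periodically updated.
    text += f", accessed {retrieved.isoformat()}."
    return text


def full_citation(original_source, **database_fields):
    """Cite the original source first, then the database it was taken from."""
    return f"{original_source} {database_citation(**database_fields)}"


example = full_citation(
    "A. Author, Journal Name 1, 1 (2000).",  # placeholder reference
    center="Example Data Center",            # placeholder center name
    access="online",
    database="EXFOR",
    retrieved=date(2015, 11, 5),
    revision="2015-10",                      # placeholder revision label
)
print(example)
```

Keeping the original source and the database as two separate parts of the string mirrors the guideline that both should be cited.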

My comment: To help citation efforts, there is a need to give clear instructions on how to cite a given data science work. Example: the Japanese nuclear data evaluation centre clearly provides guidelines on its data centre website on how to cite the evaluated nuclear data file JENDL-4.0 (see next slide).


Three Stage Nuclear Power Programme: Present Status (21 operating; 6 under construction)

[Figure: availability (%) by year, 1995-96 to 2005-06.]

Stage - I: PHWRs
• 18 operating
• 4 x 700 MWe under construction
• Several others planned
• Gestation period has been reduced
• Power potential ≅ 10,000 MWe
LWRs
• 2 BWRs operating
• 1 VVER (1000 MWe) operating and 1 VVER at an advanced stage of construction (> 95%)

Stage - II: Fast Breeder Reactors
• 40 MWth FBTR operating since 1985; technology objectives realized
• 500 MWe PFBR under construction (> 96%)
• Stage-II power potential ≅ 530,000 MWe

Stage - III: Thorium Based Reactors
• 30 kWth KAMINI operating
• 300 MWe AHWR under development
• Power potential for Stage-III is very large
• Availability of ADS can enable early introduction of thorium and enhance the capacity growth rate.

Nuclear Power Generation (2006-07 to 2015-16)
Source: http://www.npcil.co.in/main/AllProjectOperationDisplay.aspx

Year                        Gross Generation (MUs)   Capacity Factor (%)   Availability Factor (%)
2015-16 (up to Aug 2015)    16311                    78                    82
2014-15                     37835                    82                    88
2013-14                     35333                    83                    88
2012-13                     32863                    80                    90
2011-12                     32455                    79                    91
2010-11                     26472                    71                    89
2009-10                     18803                    61                    92
2008-09                     14927                    50                    82
2007-08                     16930                    54                    83
2006-07                     18634                    63                    85


STATUS, INDIA, 05 OCT. 2015: INSTALLED CAPACITY: 5780 MWe (by 21 reactors). Over 3.5% of total electricity produced in India.

5 THERMAL REACTORS ARE UNDER CONSTRUCTION (http://www.npcil.nic.in/main/ProjectConstructionStatus.aspx)

In addition, one 500 MWe sodium-cooled fast reactor is poised to be commissioned soon (next slide).

OFFICIAL SITE: http://www.bhavini.nic.in/Userpages/ViewProject.aspx

STATUS of 500 MWe Prototype Fast Breeder Reactor (PFBR) Project completion status: > 97%. Soon poised for commissioning.

Source: P.K. Vijayan, ThEC-2013, 27-31 October, CERN, Geneva.

Personal perspectives: In my humble opinion, FUKUSHIMA would have been avoided if BIG DATA SCIENCE had been followed, which would have brought in worldwide and comprehensive expertise to make reactors safer. The use of big data science in India will accelerate progress in reducing the gap in the human development index between India and the developed countries.

An Interesting Example of an Indian Operating PHWR Influenced by the Need To Use Updated Nuclear Data in Design Manuals

BETTER NUCLEAR DATA for safe operation of existing reactors: a practical example
In 2004, an incident involving a power rise took place in KAPS Unit 1, a 220 MWe PHWR (natural UO2, D2O). A public release dated April 22, 2004 by the Atomic Energy Regulatory Board provides the details of this incident: www.aerb.gov.in/prsrel/prsrel.asp

On March 10, 2004, KAPS-1 experienced an incident involving incapacitation of the reactor regulating system, leading to an unintended rise in reactor power from 73% FP to near 100% FP, with the trip occurring on Steam Generator Delta T High. Level 2 on the INES scale.

[Figure: KAPS-1 PSS ion chambers reading during power increase. Power (% FP, axis from 70 to 100) versus time (9:08:00 to 9:30:10) for channels CH-D, CH-E and CH-F.]

The fuel temperature coefficient (FTC) arises from the combined effect of the Doppler effect and the fuel re-thermalization effect. In a Pressurized Heavy Water Reactor, the precise cross-over point in burnup where the FTC becomes positive depends on many parameters, such as the temperature range and whether the cluster has 19 or 37 rods. With the 27-group wims1981 library the cross-over point for the FTC is at about 12000 MWD/Te burnup; with the 69-group version of the same library it is at about 9400 MWD/Te; with the new “iaea.lib” library it is at about 6000 MWD/Te for a 19-rod cluster and at about 4500 MWD/Te for a 37-rod cluster of a PHWR. The KAPS-1 overpower transient could be explained only with the use of the new WLUP libraries.

Exotic species (no experimental data). Large number of nuclei and properties involved.
EXAMPLE of an EXFOR entry of an Indian nuclear physics experiment
“Determination of the 233Pa(n,f) reaction cross section from 11.5 to 16.5 MeV neutron energy by the hybrid surrogate ratio approach”, B. K. Nayak, A. Saxena, D. C. Biswas, E. T. Mirgule, B. V. John, S. Santra, R. P. Vind, R. K. Choudhury, and S. Ganesan, Phys. Rev. C 78, 061602(R), published 12 December 2008.

[Figure: 233Pa(n,f) cross section (barns) versus incident energy (MeV). Data sets: 2004 Tovesson, 2004 Petit, present experiment, Empire 2.19 (with barrier formula). Surrogate reactions: 232Th(6Li,α)→234Pa and 232Th(6Li,d)→236U.]

International Network of Nuclear Reaction Data Centres (NRDC)

The Indian EXFOR Workshops are a role model that can be tried as a follow-up in other areas. Tools are used in compilation, such as digitizers, along with the evolution of common formats (XML). Considerable training is needed. This has to be an ongoing national activity in each topic area of data science.

An example of EXFOR data retrieved for the nuclear reaction 232Th(n,gamma)233Th



GROUP PHOTO: EXFOR COMPILATION WORKSHOP, JAIPUR-2009
Brought together: experimentalists, theoretical nuclear physicists, radiochemists, health physicists, mathematicians, reactor design physicists, PhD students, M.Sc. students and programmers.

Prof. Alok Saxena, Technical Convener of this EXFOR-2015 event, Prof. Rudraswamy, Local Convener, Dr. Ramakrishna Damle and all colleagues in the programme committee.







E: experimental value from a benchmark experiment.
C: calculated value of an integral parameter (k-eff, control rod worth, void coefficient, fuel temperature coefficient, shield thickness, burnup evolution, decay heat, foil dosimetry based neutron spectrum determination, etc.).
Nuclear data uncertainties dominate other errors in many cases.

$$\frac{C \pm \sqrt{\mathrm{Var}(C)}}{E \pm \sqrt{\mathrm{Var}(E)}}$$

• Var(C) has many components, arising from many causes, such as:
  • Nuclear data
  • Approximations in modelling reactor calculations, such as uncertainty in geometry, numerical approximations, and the treatment and collapsing of nuclear data (multigroup), etc.
• Var(E): error in the experiment, plus the uncertainty in system characterization (benchmark uncertainty).
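The C/E comparison above can be sketched numerically as follows. This is only an illustration under stated assumptions: the variance components are treated as independent and simply added, the errors in C and E are taken as uncorrelated, and the propagation is first order. The function names and all numbers are hypothetical and do not come from any benchmark.

```python
import math


def c_over_e(c, var_c, e, var_e):
    """Return C/E and its one-sigma uncertainty.

    First-order propagation, assuming the errors in C and E are
    uncorrelated (an assumption of this sketch, not stated on the slide).
    """
    ratio = c / e
    rel_var = var_c / c**2 + var_e / e**2
    return ratio, abs(ratio) * math.sqrt(rel_var)


def total_variance(components):
    """Sum variance components that are assumed to be independent."""
    return sum(components.values())


# Illustrative numbers only (k-eff of a hypothetical benchmark).
var_c = total_variance({
    "nuclear_data": 0.0040**2,
    "geometry": 0.0010**2,
    "numerical": 0.0005**2,
    "multigroup_collapsing": 0.0010**2,
})
var_e = total_variance({"experiment": 0.0010**2, "benchmark": 0.0015**2})

ratio, sigma = c_over_e(1.0030, var_c, 1.0000, var_e)
print(f"C/E = {ratio:.4f} +/- {sigma:.4f}")
```

In this made-up case the nuclear data term is the largest contribution to Var(C), which is the situation the slide describes as common.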

Concerns: assessing the uncertainties in data and their impact on applications in research and on the economical and safe operation of systems that use the data.

REACTOR SYSTEM OUTPUT PARAMETERS, THE {R} VECTOR: R1, R2, R3, ..., RN
R consists of integral parameters such as k-eff, control rod worth, void coefficient, fuel temperature coefficient, shield thickness, burnup (nuclides) evolution, decay heat, foil dosimetry based spectrum determination, etc.

Mindset: PDFs P(Rk)dRk, mean of Rk <Rk>, and cov(Ri, Rj).

MATHEMATICAL FORMULATION OF ERROR ANALYSIS

INPUT PARAMETERS X1, X2, X3, ..., XN (nuclear data). PDFs: P(Xk)dXk, with means <Xk> and cov(Xi, Xj). Very high rank matrix.

OUTPUT PARAMETERS R1, R2, R3, ..., RN. PDFs: P(Rk)dRk, with means <Rk> and cov(Ri, Rj). The PDFs are approximated by a normal distribution with known estimates of mean and variance.
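One common way to connect the input and output moments, written here as a sketch under the assumption that each output responds approximately linearly to small changes in the inputs, is first-order propagation. The sensitivity matrix S is notation introduced for this illustration and does not appear on the slides.

```latex
% Linearize each output about the mean input vector <X>
R_i(X) \approx R_i(\langle X \rangle)
      + \sum_{k} S_{ik}\,\bigl(X_k - \langle X_k \rangle\bigr),
\qquad
S_{ik} = \left.\frac{\partial R_i}{\partial X_k}\right|_{X = \langle X \rangle}

% Mean and covariance of the outputs that follow from the linearization
\langle R_i \rangle \approx R_i(\langle X \rangle),
\qquad
\mathrm{cov}(R_i, R_j) \approx
      \sum_{k}\sum_{l} S_{ik}\,\mathrm{cov}(X_k, X_l)\,S_{jl}
```

Under this linearization, normally distributed inputs give normally distributed outputs, which is consistent with approximating the PDFs by normal distributions.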

Use of covariances in nuclear data: systematically investigate the influence of nuclear data uncertainties on the results of calculations of integral parameters for nuclear reactors. Quantitative uncertainty analyses.

Sandwich formula: See for instance in http://www.nada.kth.se/~kaia/papers/arrasTR-9801-R3.pdf
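A minimal numerical sketch of the sandwich formula, cov(R) = S cov(X) S^T, is given below using plain Python lists. The sensitivity matrix S (derivatives of outputs with respect to inputs) and every number are illustrative assumptions. Real nuclear data covariance matrices are far larger, and a production calculation would normally use a linear algebra library.

```python
def sandwich(sensitivity, cov_x):
    """Return cov(R) = S * cov(X) * S^T using plain nested lists.

    sensitivity[i][k] is dR_i/dX_k; cov_x[k][l] is cov(X_k, X_l).
    """
    n_out, n_in = len(sensitivity), len(cov_x)
    if any(len(row) != n_in for row in sensitivity):
        raise ValueError("sensitivity rows must match the number of inputs")
    if any(len(row) != n_in for row in cov_x):
        raise ValueError("cov_x must be square")
    # First form S * cov(X), then multiply by S^T.
    s_cov = [[sum(sensitivity[i][k] * cov_x[k][l] for k in range(n_in))
              for l in range(n_in)] for i in range(n_out)]
    return [[sum(s_cov[i][l] * sensitivity[j][l] for l in range(n_in))
             for j in range(n_out)] for i in range(n_out)]


# Illustrative values only: two outputs, three inputs.
S = [[0.5, -0.2, 0.1],
     [0.0, 0.3, 0.4]]
COV_X = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]

cov_r = sandwich(S, COV_X)
for row in cov_r:
    print([round(value, 6) for value in row])
```

The off-diagonal terms of the result show how correlated inputs produce correlated outputs, which is why full covariance matrices are needed rather than variances alone.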

Indian efforts on nuclear data covariances Phase-I of a NDPCI project

• No convergence among the data recommended by different countries
• India is new to the concept of nuclear data evaluations

The 232Th(n, 2n) reaction cross section in 172 groups from IAEAGX library of the IAEA WIMS LIBRARY UPDATE PROJECT: http://www-nds.iaea.org/wimsd/

EA-ADS: ~700 ppm (calc.); PHWR-India: 550 ppm (exptl.); AHWR-India: 1700 ppm (calc.); Fast Breeder Test Reactor: 5-10 ppm (calc.)

Although the 232Th(n,2n) reaction occurs only above 6.3 MeV, it takes place even in thermal reactors, where the flux above 6 MeV is just 0.05%. The reaction provides proliferation resistance. To assess the uncertainty in calculations of the formation of 232U in thorium, we need covariances in both the flux and the (n,2n) cross-section data.
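The need for both sets of covariances can be illustrated with a group-wise reaction rate R = sum over groups of sigma_g * phi_g. The sketch below assumes first-order propagation and that the flux and cross-section errors are independent of each other. The two-group structure, the function names and all values are hypothetical and are not evaluated data; the group fluxes are chosen only so that they add up to 0.05% of the total flux.

```python
import math


def quad_form(v, cov):
    """Return v^T * cov * v for a vector and a square covariance matrix."""
    n = len(v)
    return sum(v[g] * cov[g][h] * v[h] for g in range(n) for h in range(n))


def reaction_rate_with_uncertainty(sigma, phi, cov_sigma, cov_phi):
    """Group-wise rate R = sum_g sigma_g * phi_g and its standard deviation.

    First-order propagation, assuming flux and cross-section errors are
    independent of each other (an assumption of this sketch).
    """
    rate = sum(s * p for s, p in zip(sigma, phi))
    variance = quad_form(phi, cov_sigma) + quad_form(sigma, cov_phi)
    return rate, math.sqrt(variance)


# Illustrative two-group numbers above threshold, not evaluated data.
sigma = [0.5, 1.5]              # (n,2n) cross section per group, barns
phi = [4.0e-4, 1.0e-4]          # group flux as a fraction of the total flux
cov_sigma = [[0.0025, 0.0030],
             [0.0030, 0.0225]]  # barns^2
cov_phi = [[1.6e-9, 0.0],
           [0.0, 4.0e-10]]      # (flux fraction)^2

rate, std = reaction_rate_with_uncertainty(sigma, phi, cov_sigma, cov_phi)
print(f"rate = {rate:.3e}, relative uncertainty = {100 * std / rate:.1f}%")
```

Dropping either covariance matrix would understate the variance, since each contributes its own term to the sum.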

Papers were published after a peer review in Nucl. Data Sheets, Jan. 2015 Issue

Also, the slides presented by workshop participants at the International Workshop on Nuclear Data Covariances, Santa Fe, NM, USA, 28 April to 1 May, 2014 (organized by Los Alamos National Laboratory) are available online:

http://www.osti.gov/scitech/biblio/1136943

CONCLUDING REMARKS
The talk is on the citation needs and procedures for numerical databases, taking nuclear data science R&D as an example. These are not bibliographic databases.
The citation protocols are well defined, but Indian scientists and engineers need to be sensitized. This workshop is timely.
Internationally, the evolution of the subject of big scientific databases and knowledge management has broadly included several identified base technology efforts, in a generic sense across disciplines, using advances in computer science, numerical simulations, physics and technology.
Error specifications in scientific data are essential.

The data science activities in nuclear data science tailored to the Indian Atomic Energy applications are generically useful in other areas of science in the Indian context. There is a need to share the data science expertise horizontally across areas and disciplines. Data compilation workshops need to be encouraged in every discipline.

A BIG THANK YOU FOR THIS VERY INTERESTING BIG DATA SCIENCE WORKSHOP